<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[pganalyze Blog - Articles by Lukas Fittl]]></title><description><![CDATA[Monitoring Postgres and tuning query performance]]></description><link>https://pganalyze.com</link><generator>GatsbyJS</generator><lastBuildDate>Thu, 07 May 2026 18:37:12 GMT</lastBuildDate><atom:link href="https://pganalyze.com/feed/lukas-fittl.xml" rel="self" type="application/rss+xml"/><item><title><![CDATA[Waiting for Postgres 19: Reduced timing overhead for EXPLAIN ANALYZE with RDTSC]]></title><description><![CDATA[In today’s E122 of “5mins of Postgres” we're talking about the upcoming Postgres 19 release, and how a change in the Postgres instrumentation handling reduces overhead of timing measurements in EXPLAIN ANALYZE using the RDTSC instruction, and why this will allow turning on  for more workloads. We dive into the recently committed change that I (Lukas) authored together with Andres Freund and David Geier. See the full transcript with examples below. Share this episode: Click here to share this…]]></description><link>https://pganalyze.com/blog/5mins-postgres-19-reduced-timing-overhead-explain-analyze</link><guid isPermaLink="false">https://pganalyze.com/blog/5mins-postgres-19-reduced-timing-overhead-explain-analyze</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Sat, 11 Apr 2026 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;In today’s E122 of “5mins of Postgres” we&apos;re talking about the upcoming Postgres 19 release, and how a change in the Postgres instrumentation handling reduces overhead of timing measurements in EXPLAIN ANALYZE using the RDTSC instruction, and why this will allow turning on &lt;code &gt;auto_explain.log_timing&lt;/code&gt; for more workloads.&lt;/p&gt;
&lt;p&gt;We dive into the &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=294520c44487ecaade7a6ea8781b973f9ed03909&quot;&gt;recently committed&lt;/a&gt; change that I (Lukas) authored together with Andres Freund and David Geier. See the full transcript with examples below.&lt;/p&gt;
&lt;iframe
    width=&quot;750&quot;
    height=&quot;421&quot;
    src=&quot;https://www.youtube-nocookie.com/embed/4EgdLMxkCrE&quot;
    frameborder=&quot;0&quot;
    modestbranding=&quot;1&quot; controls=&quot;0&quot; allownetworking=&quot;internal&quot;
    allow=&quot;autoplay; encrypted-media&quot;
    allowfullscreen
&gt;
&lt;/iframe&gt;
&lt;br /&gt;&lt;br /&gt;
&lt;p&gt;&lt;strong&gt;Share this episode:&lt;/strong&gt; Click here to share this episode &lt;a href=&quot;https://www.LinkedIn.com/shareArticle?mini=true&amp;#x26;url=https://pganalyze.com/blog/5mins-postgres-19-reduced-timing-overhead-explain-analyze&amp;#x26;title=Waiting%20for%20Postgres%2019%20Reduced%20timing%20overhead%20for%20EXPLAIN%20ANALYZE%20with%20RDTSC&amp;#x26;source=LinkedIn&quot;&gt;on LinkedIn&lt;/a&gt;. Feel free to &lt;a href=&quot;https://pganalyze.com/newsletter&quot;&gt;sign up for our newsletter&lt;/a&gt; and &lt;a href=&quot;https://www.youtube.com/channel/UCDV_1Dz2Ixgl1nT_3DUZVFw&quot;&gt;subscribe to our YouTube channel&lt;/a&gt;.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#the-problem-of-slow-timing-measurements&quot;&gt;The problem of slow timing measurements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#rdtsc-vs-rdtscp&quot;&gt;RDTSC vs RDTSCP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#the-new-timing_clock_source-postgres-setting&quot;&gt;The new timing_clock_source Postgres setting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#live-demo-on-postgres-19-development-branch&quot;&gt;Live demo on Postgres 19 development branch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#what-we-have-discussed-in-this-episode-of-5mins-of-postgres&quot;&gt;What we have discussed in this episode of 5mins of Postgres&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;strong&gt;Transcript&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Welcome back to 5mins of Postgres! Today we talk about a change in the upcoming Postgres 19 release that will lower timing overhead for EXPLAIN ANALYZE.&lt;/p&gt;
&lt;p&gt;This is a change that I contributed myself together with Andres Freund and David Geier, and we&apos;ve worked on this change for a couple of years now actually. But in this release, we basically sat down and we really figured out all the little details that make this work. Now, this was committed recently to the Postgres 19 development branch, and to be clear, it might still be taken out of the final release if any issues are found, but right now, I think there&apos;s a decent chance it stays in.&lt;/p&gt;
&lt;p&gt;Postgres 19 will be released in September or October. Feature freeze just happened, and the beta release will come out sometime in May this year. Now let me show you a little bit more about what this change is about.&lt;/p&gt;
&lt;h2 id=&quot;the-problem-of-slow-timing-measurements&quot; &gt;&lt;a href=&quot;#the-problem-of-slow-timing-measurements&quot; aria-label=&quot;the problem of slow timing measurements permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The problem of slow timing measurements&lt;/h2&gt;
&lt;p&gt;Back in 2020, &lt;a href=&quot;https://www.postgresql.org/message-id/flat/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de&quot;&gt;Andres Freund started a mailing list thread&lt;/a&gt; where he was basically saying when you run EXPLAIN ANALYZE on a query, it looks a lot slower than it actually is. So in this example here, Andres created a table with 50 million rows:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; lotsarows&lt;span &gt;(&lt;/span&gt;&lt;span &gt;key&lt;/span&gt; &lt;span &gt;int&lt;/span&gt; &lt;span &gt;not&lt;/span&gt; &lt;span &gt;null&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; lotsarows &lt;span &gt;SELECT&lt;/span&gt; generate_series&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;50000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
VACUUM FREEZE lotsarows&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Very simple table, and then he ran a &lt;code &gt;COUNT(*)&lt;/code&gt; on that table:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;SELECT count(*) FROM lotsarows;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If I run the &lt;code &gt;COUNT(*)&lt;/code&gt; without any EXPLAIN, I get a run time of about 1,900 milliseconds. If I run EXPLAIN ANALYZE with TIMING OFF, and back in that release also with BUFFERS OFF, I get a run time of about 2,300 milliseconds. Now, if I turn TIMING ON, the run time more than doubles compared to the actual query. Instead of my query taking 1,900 milliseconds, the query now takes 4,200 milliseconds:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;-- best of three:
SELECT count(*) FROM lotsarows;
Time: 1923.394 ms (00:01.923)

-- best of three:
EXPLAIN (ANALYZE, TIMING OFF) SELECT count(*) FROM lotsarows;
Time: 2319.830 ms (00:02.320)

-- best of three:
EXPLAIN (ANALYZE, TIMING ON) SELECT count(*) FROM lotsarows;
Time: 4202.649 ms (00:04.203)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And first of all, that&apos;s a problem because it skews what my actual performance is. If I&apos;m doing testing with EXPLAIN ANALYZE, and I don&apos;t recognize that timing has overhead, I basically think my query is slower than it actually is. The other issue is that if you run auto_explain, usually we recommend people turn log_timing off. Just for example, here in pganalyze&apos;s install instructions, we recommend that people use auto_explain, but today we always tell people to turn timing off, because we think that it is not safe to use on most production systems without knowing your workload better.&lt;/p&gt;
&lt;p&gt;If we look at the problem in more detail, Andres ran a quick profile to see where that overhead is coming from:&lt;/p&gt;
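&lt;p&gt;As a rough sketch of what that recommendation looks like when tried out in a single session (this assumes a superuser connection, and the one-second threshold is only an illustrative value, not a recommendation from this episode):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Load auto_explain for this session only (requires superuser)
LOAD &apos;auto_explain&apos;;

-- Log plans for statements slower than 1 second (illustrative threshold)
SET auto_explain.log_min_duration = &apos;1s&apos;;

-- Collect actual row counts, but skip the per-node timing calls
SET auto_explain.log_analyze = on;
SET auto_explain.log_timing = off;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With &lt;code &gt;log_timing&lt;/code&gt; off, the logged plans still contain actual row counts, just no per-node timings.&lt;/p&gt;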
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;-   95.49%     0.00%  postgres     postgres                 [.] agg_retrieve_direct (inlined)
   - agg_retrieve_direct (inlined)
      - 79.27% fetch_input_tuple
         - ExecProcNode (inlined)
            - 75.72% ExecProcNodeInstr
               + 25.22% SeqNext
               - 21.74% InstrStopNode
                  + 17.80% __GI___clock_gettime (inlined)
               - 21.44% InstrStartNode
                  + 19.23% __GI___clock_gettime (inlined)
               + 4.06% ExecScan
      + 13.09% advance_aggregates (inlined)
        1.06% MemoryContextReset&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;rdtsc-vs-rdtscp&quot; &gt;&lt;a href=&quot;#rdtsc-vs-rdtscp&quot; aria-label=&quot;rdtsc vs rdtscp permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;RDTSC vs RDTSCP&lt;/h2&gt;
&lt;p&gt;So first of all, in that profile we see the InstrStartNode and InstrStopNode calls. Those are calls that get added by Postgres when instrumentation is on, that is, when I&apos;m running an EXPLAIN ANALYZE, and we can see that most of that time is spent in the clock_gettime function. On a modern Linux system, this is not actually a syscall. Instead, it directly calls &lt;code &gt;RDTSCP&lt;/code&gt;. &lt;code &gt;RDTSCP&lt;/code&gt; is basically a special instruction on the CPU that gets what&apos;s called the timestamp counter.&lt;/p&gt;
&lt;p&gt;And think of the timestamp counter as a value that keeps going up, that basically counts cycles, but it counts cycles in a way that isn&apos;t influenced by power level changes or other issues that might cause it to be skewed. So it&apos;s actually pretty reliable. Now the problem is that what &lt;code &gt;RDTSCP&lt;/code&gt; does is it waits until all prior instructions have finished, and when we say instructions, we mean CPU instructions. And so basically what happens is that the timing itself is not just getting the time, but it&apos;s also blocking other activity from occurring.&lt;/p&gt;
&lt;p&gt;It&apos;s blocking the CPU from basically running things in parallel effectively. Now, there is a different instruction called &lt;code &gt;RDTSC&lt;/code&gt; without the P. And this instruction basically does not have this blocking of other concurrent instructions. And so when you have this in the picture, then it actually drastically lowers the performance overhead of the timing.&lt;/p&gt;
&lt;p&gt;In this particular example Andres ran at the time, instead of the query taking 4,200 milliseconds, it actually took only 2,600 milliseconds:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                          QUERY PLAN                                                           │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=846239.20..846239.21 rows=1 width=8) (actual time=2610.235..2610.235 rows=1 loops=1)                         │
│   -&gt;  Seq Scan on lotsarows  (cost=0.00..721239.16 rows=50000016 width=0) (actual time=0.006..1512.886 rows=50000000 loops=1) │
│ Planning Time: 0.028 ms                                                                                                       │
│ Execution Time: 2610.256 ms                                                                                                   │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)

Time: 2610.589 ms (00:02.611)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This was mainly a prototype at the time. There were a lot of complexities, and part of the reason why this took so long to get implemented is that we needed to make sure it works on all the different kinds of systems that Postgres gets used on.&lt;/p&gt;
&lt;h2 id=&quot;the-new-timing_clock_source-postgres-setting&quot; &gt;&lt;a href=&quot;#the-new-timing_clock_source-postgres-setting&quot; aria-label=&quot;the new timing_clock_source postgres setting permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The new timing_clock_source Postgres setting&lt;/h2&gt;
&lt;p&gt;One of the things we ended up adding based on discussions on the mailing lists is a new setting to control whether this gets used or not. So with the &lt;a href=&quot;https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-TIME&quot;&gt;new &quot;timing_clock_source&quot; setting&lt;/a&gt;, you basically control whether you automatically use the TSC clock source on x86-64 CPUs that are modern enough that have the right instructions. You can force the old way of using the system clock, or you can explicitly set the TSC clock source.&lt;/p&gt;
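&lt;p&gt;A minimal sketch of how you might inspect and override the setting in a session on a Postgres 19 development build. Only the &lt;code &gt;system&lt;/code&gt; and &lt;code &gt;tsc&lt;/code&gt; values are shown here because those are the ones used in this episode; check the linked documentation for the full list of accepted values:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Show which clock source is currently configured
SHOW timing_clock_source;

-- Force the old way of using the system clock
SET timing_clock_source = &apos;system&apos;;

-- Explicitly request the TSC clock source
SET timing_clock_source = &apos;tsc&apos;;

-- Go back to the configured default
RESET timing_clock_source;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;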
&lt;p&gt;In Postgres, we&apos;re now basically splitting this into two different use cases. For things like EXPLAIN ANALYZE, where we don&apos;t necessarily care about a very short, exactly precise measurement, and it&apos;s more about the cumulative time taken, we use the RDTSC instruction. In other cases, where we care about higher precision and what we measure is still a short run time, we use the RDTSCP instruction, which has higher overhead. There is a lot of supporting code to make this work in different environments. If you&apos;re interested in how that works, look at the &quot;instr_time.c&quot; file.&lt;/p&gt;
&lt;h2 id=&quot;live-demo-on-postgres-19-development-branch&quot; &gt;&lt;a href=&quot;#live-demo-on-postgres-19-development-branch&quot; aria-label=&quot;live demo on postgres 19 development branch permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Live demo on Postgres 19 development branch&lt;/h2&gt;
&lt;p&gt;I want to show you an actual example of what this improvement now looks like in the 19 branch. So here I have an SSH client because my machine right now actually is a MacBook. And this initial release will only be focused on getting the fast timing in for x86-64. ARM has a similar instruction, but there are some outstanding issues for ARM machines. So right now I&apos;m connected here via SSH to a different machine. This machine sits right next to me, it&apos;s this little Framework Desktop here, and that one is an x86 machine.&lt;/p&gt;
&lt;p&gt;And so now what I can do here is I have my Postgres branch already built. I&apos;m first going to run the pg_test_timing utility, which measures the overhead of timing. Now here we get three different measurements:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;System clock source: clock_gettime (CLOCK_MONOTONIC)
Average loop time including overhead: 18.80 ns
Histogram of timing durations:
   &amp;lt;= ns   % of total  running %      count
       0       0.0000     0.0000          0
       1       0.0000     0.0000          0
       3       0.0000     0.0000          0
       7       0.0000     0.0000          0
      15      12.7533    12.7533   20353931
      31      87.2357    99.9890  139225930
...

Clock source: RDTSCP
Average loop time including overhead: 16.94 ns
Histogram of timing durations:
   &amp;lt;= ns   % of total  running %      count
       0       0.0000     0.0000          0
       1       0.0000     0.0000          0
       3       0.0000     0.0000          0
       7       0.0000     0.0000          0
      15      31.1807    31.1807   55204578
      31      68.8159    99.9966  121836600
...

Fast clock source: RDTSC
Average loop time including overhead: 11.69 ns
Histogram of timing durations:
   &amp;lt;= ns   % of total  running %      count
       0       0.0000     0.0000          0
       1       0.0000     0.0000          0
       3       0.0000     0.0000          0
       7       0.0000     0.0000          0
      15      83.5188    83.5188  214321443
      31      16.4789    99.9977   42287217
...

TSC frequency in use: 2993629 kHz
TSC frequency from calibration: 2994357 kHz
TSC clock source will be used by default, unless timing_clock_source is set to &apos;system&apos;.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We get the built-in clock source called &lt;code &gt;clock_gettime&lt;/code&gt;. That took 18.8 nanoseconds to get a time measurement. Next we&apos;re checking with &lt;code &gt;RDTSCP&lt;/code&gt;, which again waits for prior instructions to finish. That one takes 16.9 nanoseconds. And then if we&apos;re running with &lt;code &gt;RDTSC&lt;/code&gt;, it takes 11.7 nanoseconds. So clearly &lt;code &gt;RDTSC&lt;/code&gt; has less overhead here: each measurement takes roughly 38% less time than with the system clock in this test timing program. I also see which frequency gets used, and then I also see whether that new clock source will be used by default. If I don&apos;t want to use it, I would have to set &lt;code &gt;timing_clock_source&lt;/code&gt; to &lt;code &gt;system&lt;/code&gt; explicitly.&lt;/p&gt;
&lt;p&gt;The only reason why that would make sense by the way, is if for some reason your TSC is emulated in a certain way so the timing measurements are not stable. And then &lt;code &gt;timing_clock_source = system&lt;/code&gt; might provide you those stable measurements.&lt;/p&gt;
&lt;p&gt;Now I can run a psql client and show you the actual example. I already have that table that Andres created as an example here as well. First of all, I&apos;ll turn on &lt;code &gt;\timing&lt;/code&gt;. This is on the psql side, and it just gives me the run time. Now I&apos;m doing a &lt;code &gt;SELECT COUNT(*)&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres=# SELECT count(*) FROM lotsarows;
  count   
----------
 50000000
(1 row)

Time: 268.466 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is a more modern machine, so this takes the same 50 million rows, just goes a little faster. So I have about 260 - 270 milliseconds of runtime here.&lt;/p&gt;
&lt;p&gt;Let&apos;s start by running with &lt;code &gt;EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF)&lt;/code&gt;. I&apos;m not doing a lot of extra work really. I&apos;m just counting how many rows got returned:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres=# EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT count(*) FROM lotsarows;
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=482655.97..482655.98 rows=1 width=8) (actual rows=1.00 loops=1)
   -&gt;  Gather  (cost=482655.75..482655.96 rows=2 width=8) (actual rows=3.00 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -&gt;  Partial Aggregate  (cost=481655.75..481655.76 rows=1 width=8) (actual rows=1.00 loops=3)
               -&gt;  Parallel Seq Scan on lotsarows  (cost=0.00..429572.40 rows=20833340 width=0) (actual rows=16666666.67 loops=3)
 Planning Time: 0.174 ms
 Execution Time: 297.043 ms
(8 rows)

Time: 297.535 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That&apos;s pretty simple.&lt;/p&gt;
&lt;p&gt;And then if I now turn &lt;code &gt;TIMING ON&lt;/code&gt;, and this is with the TSC clock source, I get a measurement of about 350 milliseconds:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres=# EXPLAIN (ANALYZE, TIMING ON, BUFFERS OFF) SELECT count(*) FROM lotsarows;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=482655.97..482655.98 rows=1 width=8) (actual time=349.687..351.719 rows=1.00 loops=1)
   -&gt;  Gather  (cost=482655.75..482655.96 rows=2 width=8) (actual time=349.606..351.709 rows=3.00 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -&gt;  Partial Aggregate  (cost=481655.75..481655.76 rows=1 width=8) (actual time=347.932..347.933 rows=1.00 loops=3)
               -&gt;  Parallel Seq Scan on lotsarows  (cost=0.00..429572.40 rows=20833340 width=0) (actual time=0.149..201.918 rows=16666666.67 loops=3)
 Planning Time: 0.186 ms
 Execution Time: 351.773 ms
(8 rows)

Time: 352.171 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I&apos;m still seeing, I would say about a 20 - 25% overhead here. So it&apos;s not free, but it&apos;s substantially better than with the system clock source.&lt;/p&gt;
&lt;p&gt;If I do &lt;code &gt;SET timing_clock_source = system&lt;/code&gt;, and I do the timing again, you see a drastic difference:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;SET timing_clock_source = &apos;system&apos;;
EXPLAIN (ANALYZE, TIMING ON, BUFFERS OFF) SELECT count(*) FROM lotsarows;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=482655.97..482655.98 rows=1 width=8) (actual time=799.624..801.496 rows=1.00 loops=1)
   -&gt;  Gather  (cost=482655.75..482655.96 rows=2 width=8) (actual time=799.535..801.488 rows=3.00 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -&gt;  Partial Aggregate  (cost=481655.75..481655.76 rows=1 width=8) (actual time=797.885..797.887 rows=1.00 loops=3)
               -&gt;  Parallel Seq Scan on lotsarows  (cost=0.00..429572.40 rows=20833340 width=0) (actual time=0.073..417.005 rows=16666666.67 loops=3)
 Planning Time: 0.115 ms
 Execution Time: 801.529 ms
(8 rows)

Time: 801.979 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Just for clarity, if I just did a regular select count star here, it would take me 260 milliseconds to run the actual query:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres=# SELECT count(*) FROM lotsarows;
  count   
----------
 50000000
(1 row)

Time: 263.824 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And with the old timing clock source, I get a run time of 800 milliseconds. Versus with the new TSC clock source, I get 355 milliseconds:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;SET timing_clock_source = &apos;tsc&apos;;
EXPLAIN (ANALYZE, TIMING ON, BUFFERS OFF) SELECT count(*) FROM lotsarows;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=482655.97..482655.98 rows=1 width=8) (actual time=353.401..355.238 rows=1.00 loops=1)
   -&gt;  Gather  (cost=482655.75..482655.96 rows=2 width=8) (actual time=353.292..355.229 rows=3.00 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -&gt;  Partial Aggregate  (cost=481655.75..481655.76 rows=1 width=8) (actual time=351.081..351.082 rows=1.00 loops=3)
               -&gt;  Parallel Seq Scan on lotsarows  (cost=0.00..429572.40 rows=20833340 width=0) (actual time=0.131..200.584 rows=16666666.67 loops=3)
 Planning Time: 0.150 ms
 Execution Time: 355.291 ms
(8 rows)

Time: 355.690 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
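&lt;p&gt;To put those runs side by side, here is a small sketch that computes the relative overhead of each EXPLAIN variant against the plain &lt;code &gt;SELECT count(*)&lt;/code&gt;. The numbers are simply the psql &lt;code &gt;Time:&lt;/code&gt; values from the first run of each variant above, and picking 268.466 ms as the baseline is my own choice for illustration; you could just as well compare against the TIMING OFF run:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Overhead of each run relative to the plain SELECT (268.466 ms)
SELECT label,
       ms,
       round(100.0 * (ms - 268.466) / 268.466, 1) AS overhead_pct
FROM (VALUES
        (&apos;TIMING OFF&apos;,        297.535),
        (&apos;TIMING ON, tsc&apos;,    352.171),
        (&apos;TIMING ON, system&apos;, 801.979)
     ) AS runs(label, ms);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;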
&lt;p&gt;So a drastic difference, and to me this also makes a difference for many systems, where I would now feel comfortable using auto_explain with log_timing on, simply because most queries are not this extreme. To be clear, many realistic queries go through these instrumentation start and stop functions far less often.&lt;/p&gt;
&lt;p&gt;Previously you would&apos;ve seen 5-10% overhead on average, now you&apos;ll probably see 2-3% on average, which for many systems is a good trade-off to have the full instrumentation data available in auto_explain.&lt;/p&gt;
&lt;p&gt;There are many other new features coming up, and you&apos;ll hear more about them in upcoming episodes.&lt;/p&gt;
&lt;p&gt;I hope you learned something new from E122 of 5mins of Postgres. Feel free to &lt;a href=&quot;https://www.youtube.com/channel/UCDV_1Dz2Ixgl1nT_3DUZVFw&quot;&gt;subscribe to our YouTube channel&lt;/a&gt;, &lt;a href=&quot;https://pganalyze.com/newsletter&quot;&gt;sign up for our newsletter&lt;/a&gt; or &lt;a href=&quot;https://www.linkedin.com/company/pganalyze/&quot;&gt;follow us on LinkedIn&lt;/a&gt; to get updates about new episodes!&lt;/p&gt;
&lt;h2 id=&quot;what-we-have-discussed-in-this-episode-of-5mins-of-postgres&quot; &gt;&lt;a href=&quot;#what-we-have-discussed-in-this-episode-of-5mins-of-postgres&quot; aria-label=&quot;what we have discussed in this episode of 5mins of postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What we have discussed in this episode of 5mins of Postgres&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=294520c44487ecaade7a6ea8781b973f9ed03909&quot;&gt;Postgres 19 commit - instrumentation: Use Time-Stamp Counter on x86-64 to lower overhead&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/message-id/flat/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de&quot;&gt;Postgres pgsql-hackers mailinglist discussion: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-TIME&quot;&gt;The new timing_clock_source Postgres setting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/postgres/postgres/blob/master/src/common/instr_time.c&quot;&gt;Timing instrumentation in instr_time.c&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/docs/explain/setup/amazon_rds/03_review_settings&quot;&gt;Recommended auto_explain settings by pganalyze&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[The Dilemma of the ‘AI DBA’]]></title><description><![CDATA[Like many in the industry, my perspective on AI tools has shifted considerably over the past year, specifically when it comes to software engineering tasks. Going from “this is nice, but doesn’t really solve complex tasks for me” to “this actually works pretty well for certain use cases.” But the more capable these tools become, the sharper one dilemma gets: you can hand off the work, but an AI agent won’t ultimately be responsible when the database goes down and your app stops working. For…]]></description><link>https://pganalyze.com/blog/the-ai-dba-dilemma</link><guid isPermaLink="false">https://pganalyze.com/blog/the-ai-dba-dilemma</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Wed, 11 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Like many in the industry, my perspective on AI tools has shifted considerably over the past year, specifically when it comes to software engineering tasks. Going from “this is nice, but doesn’t really solve complex tasks for me” to “this actually works pretty well for certain use cases.” But the more capable these tools become, the sharper one dilemma gets: you can hand off the work, but an AI agent won’t ultimately be responsible when the database goes down and your app stops working.&lt;/p&gt;
&lt;p&gt;For databases, the terms ‘AI DBA’ and ‘self-driving database’ have become marketing buzzwords with the promise of having an agent that can handle creating indexes, optimizing data models, and tuning parameter settings, leaving humans free to focus on higher-value work. The appeal is understandable. Databases are hard; Postgres can behave in odd ways; and, &lt;strong&gt;if an agent can absorb that complexity, why invest in becoming an expert yourself?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;While I’m a big believer in automating routine tasks, I worry the ‘AI DBA’ discourse is missing the mark in terms of the practical, grounded truth of how to use AI tools effectively, especially in production, and who’s responsible when incidents happen.&lt;/p&gt;
&lt;p&gt;If we let the AI do it all willy-nilly, then we accumulate cognitive debt and lose important context, making it harder to take responsibility for the outcome. But there is hope yet, and it comes in the form of enabling engineers instead of replacing DBAs.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#how-the-ai-dba-framing-gets-it-wrong&quot;&gt;How the ‘AI DBA’ framing gets it wrong&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#what-llms-are-actually-good-at&quot;&gt;What LLMs are actually good at&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#lets-enable-engineers-and-dbas-to-own-responsibility-for-their-database&quot;&gt;Let&apos;s enable engineers and DBAs to own responsibility for their database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#looking-ahead&quot;&gt;Looking ahead&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;how-the-ai-dba-framing-gets-it-wrong&quot; &gt;&lt;a href=&quot;#how-the-ai-dba-framing-gets-it-wrong&quot; aria-label=&quot;how the ai dba framing gets it wrong permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How the ‘AI DBA’ framing gets it wrong&lt;/h2&gt;
&lt;p&gt;Framing the role of AI in databases as an ‘AI DBA’ makes a critical mistake: it conflates doing the work with owning the outcome. DevOps gave us a useful precedent here. It didn&apos;t remove responsibility from teams: it moved it closer to them. A feature isn&apos;t done when it&apos;s merged: it&apos;s done when it works in production. That same standard should apply to the database: a deployment isn&apos;t done until it performs in production. AI doesn&apos;t change that bar.&lt;/p&gt;
&lt;p&gt;Let’s imagine we have a database team today, with titles like “DBA” or “data platform engineer”:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/8711fdf26bb22c7e6ddc6941fe5ec689/1df5b/today.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Diagram showing application and data platform teams&quot;
        title=&quot;Diagram showing application and data platform teams&quot;
        src=&quot;https://pganalyze.com/static/8711fdf26bb22c7e6ddc6941fe5ec689/1d69c/today.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;And let’s say our plan is to replace parts of that team with our new ‘AI DBA’ agent, which can do the work well enough and is available at all times:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/d0210406485b755e2aa56789f95da9a2/1df5b/tomorrow-with-ai-dba.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Diagram showing the data platform team replaced by AI DBAs&quot;
        title=&quot;Diagram showing the data platform team replaced by AI DBAs&quot;
        src=&quot;https://pganalyze.com/static/d0210406485b755e2aa56789f95da9a2/1d69c/tomorrow-with-ai-dba.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;But what happens in that scenario if we have the ‘AI DBA’ agent in the picture? Does it magically fix all production problems? Today it would struggle with even having production access in the first place, because giving production credentials to an autonomous AI agent does not absolve you of its decisions.&lt;/p&gt;
&lt;h2 id=&quot;what-llms-are-actually-good-at&quot; &gt;&lt;a href=&quot;#what-llms-are-actually-good-at&quot; aria-label=&quot;what llms are actually good at permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What LLMs are actually good at&lt;/h2&gt;
&lt;p&gt;Even if models improve significantly, they are still LLMs. You can&apos;t hold an agent accountable. It needs approvals for high-risk actions. Which means in any realistic scenario, responsibility falls back on either the infrastructure team or the application team — and we&apos;ve just made the handoff murkier.&lt;/p&gt;
&lt;p&gt;Worse, framing the problem as ‘nobody wants to do DBA work, so let&apos;s replace the DBA’ sends a clear message to experienced database engineers: your expertise isn&apos;t valued here. And beyond the question of accountability, it creates serious problems in practice.&lt;/p&gt;
&lt;p&gt;If we think back to why tools like Claude Code have had such tremendous success over the last year, it’s because they put engineers in the driver’s seat and made them more effective at what they’re already doing: quickly cross-referencing different pieces of source code, letting the LLM write code for CRUD tasks, exploring different ways of solving a problem, or investigating production incidents across different data sources, whilst quickly going back to the source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What does this mean for working with Postgres databases?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Rather than replacing database experts with an AI agent, we should focus on the tasks LLMs genuinely excel at today: information retrieval across different tools, locating the source code file that produced a query, automatically reviewing pull requests for bad patterns, and providing basic fluency for someone unfamiliar with the database. We should then apply that focus to enabling engineers who work with databases but whose day-to-day job isn&apos;t the database.&lt;/p&gt;
&lt;h2 id=&quot;lets-enable-engineers-and-dbas-to-own-responsibility-for-their-database&quot; &gt;&lt;a href=&quot;#lets-enable-engineers-and-dbas-to-own-responsibility-for-their-database&quot; aria-label=&quot;lets enable engineers and dbas to own responsibility for their database permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Let&apos;s enable engineers and DBAs to own responsibility for their database&lt;/h2&gt;
&lt;p&gt;The role of the DBA or data platform engineer needs to change. Successful teams already focus on enabling application engineers, instead of being gatekeepers to changes. The future is specific, purpose-built tools, owned by platform teams, made to be reliable for production use:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/d9ebc881396c99dbf13051b813a2a1a4/1df5b/tomorrow-with-enabling.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Diagram showing AI tools next to both application and data platform team members, calling out individual tools&quot;
        title=&quot;Diagram showing AI tools next to both application and data platform team members, calling out individual tools&quot;
        src=&quot;https://pganalyze.com/static/d9ebc881396c99dbf13051b813a2a1a4/1d69c/tomorrow-with-enabling.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;If we get it right, AI tools can help us collect evidence for performance optimizations, so that when the application engineer goes to the data platform team for help, they bring the information necessary to facilitate effective investigative work.&lt;/p&gt;
&lt;p&gt;AI tools can also help us bridge the gap in the other direction: data platform engineers can put on the shoes of the application engineer and become familiar with the codebase, by asking things like &quot;Where did this query get called?&quot; or &quot;Does this field get used somewhere?&quot;&lt;/p&gt;
&lt;p&gt;To enable organizations to roll out AI tools not just in development, but in production use too, we need to be clear on what is being done - and write code that abstracts production information and possibly actions in a safe way. Whether that means specific tool calls, sandboxing, or providing restricted access via a CLI, it needs to be curated to suit an organization’s environment.&lt;/p&gt;
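&lt;p&gt;As one hypothetical illustration of restricted access, a platform team could start from a dedicated Postgres role for AI tooling that can read statistics but defaults to read-only sessions. This is a minimal sketch, not a complete security setup: the role name &lt;code &gt;ai_tools_ro&lt;/code&gt; and the timeout value are assumptions made for the example, while &lt;code &gt;pg_read_all_stats&lt;/code&gt; is a predefined Postgres role.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical role for AI tooling; the name and limits are illustrative only
CREATE ROLE ai_tools_ro LOGIN;

-- Predefined role: read access to statistics views such as pg_stat_activity
GRANT pg_read_all_stats TO ai_tools_ro;

-- Sessions of this role default to read-only transactions and a short timeout
ALTER ROLE ai_tools_ro SET default_transaction_read_only = on;
ALTER ROLE ai_tools_ro SET statement_timeout = &apos;5s&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that the last two statements only set session defaults, which the role itself could override. Actual enforcement still comes from which privileges are granted, and from the tool layer that sits in front of the database.&lt;/p&gt;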
&lt;p&gt;The data platform team should own and provide safe, reliable tools that enable engineers across the organization to use AI tools effectively with production statistics and metadata, and be responsible for their own database.&lt;/p&gt;
&lt;h2 id=&quot;looking-ahead&quot; &gt;&lt;a href=&quot;#looking-ahead&quot; aria-label=&quot;looking ahead permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Looking ahead&lt;/h2&gt;
&lt;p&gt;At pganalyze we build the best monitoring and optimization tools for Postgres, to enable both engineers and platform teams to work better together. One of the ways we do that is we make sure you have reliable monitoring data about your production system. Which query was running yesterday? What EXPLAIN plan was being used? Did the plan switch unexpectedly?&lt;/p&gt;
&lt;p&gt;And it turns out that data is pretty useful when working with AI tools. The &lt;a href=&quot;https://pganalyze.com/docs/mcp&quot;&gt;pganalyze MCP Server&lt;/a&gt;, now in early access, enables safe sharing of specific information about production databases, whilst keeping in mind specific workflows, and enabling engineers to work better.&lt;/p&gt;
&lt;p&gt;There is more to come later this year. Our aim is to focus on automating the tedious tasks, whilst staying grounded in what actually works for production systems. Sometimes it makes sense to use an AI tool, and sometimes deterministic logic is the best choice. I’m excited to keep working with teams, hearing what works for them, and discovering new best practices together.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;With thanks to Maciek Sakrejda, Bison Hubert and Laura Kelso for input and reviews on this article.&lt;/em&gt;&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O]]></title><description><![CDATA[With the Postgres 18 Beta 1 release this week, a multi-year effort and significant architectural shift in Postgres is taking shape: Asynchronous I/O (AIO). These capabilities are still under active development, but they represent a fundamental change in how Postgres handles I/O, offering the potential for significant performance gains, particularly in cloud environments where latency is often the bottleneck. Why asynchronous I/O matters How Postgres 17’s read streams paved the way New io_method…]]></description><link>https://pganalyze.com/blog/postgres-18-async-io</link><guid isPermaLink="false">https://pganalyze.com/blog/postgres-18-async-io</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Wed, 07 May 2025 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;With the &lt;a href=&quot;https://www.postgresql.org/about/news/postgresql-18-beta-1-released-3070/&quot;&gt;Postgres 18 Beta 1&lt;/a&gt; release this week, a multi-year effort and significant architectural shift in Postgres is taking shape: &lt;strong&gt;Asynchronous I/O (AIO)&lt;/strong&gt;. These capabilities are still under active development, but they represent a fundamental change in how Postgres handles I/O, offering the potential for significant performance gains, particularly in cloud environments where latency is often the bottleneck.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#why-asynchronous-io-matters&quot;&gt;Why asynchronous I/O matters&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#how-postgres-17s-read-streams-paved-the-way&quot;&gt;How Postgres 17’s read streams paved the way&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#new-io_method-setting-in-postgres-18&quot;&gt;New io_method setting in Postgres 18&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#io_method--sync&quot;&gt;io_method = sync&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#io_method--worker&quot;&gt;io_method = worker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#io_method--io_uring&quot;&gt;io_method = io_uring&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#asynchronous-io-in-action&quot;&gt;Asynchronous I/O in action&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#benchmark-on-aws-doubling-read-performance--even-greater-gains-from-io_uring&quot;&gt;Benchmark on AWS: Doubling read performance &amp;#x26; even greater gains from io_uring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#tuning-effective_io_concurrency&quot;&gt;Tuning effective_io_concurrency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#monitoring-ios-in-flight-with-pg_aios&quot;&gt;Monitoring I/Os in flight with pg_aios&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#heads-up-async-io-makes-io-timing-information-hard-to-interpret&quot;&gt;Heads Up: Async I/O makes I/O timing information hard to interpret&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#in-summary&quot;&gt;In summary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;While some features may still be adjusted or dropped during the beta period before the final release, now is the best time to test and validate how Postgres 18 performs in practice. In Postgres 18 AIO is limited to read operations; writes remain synchronous, though support may expand in future versions.&lt;/p&gt;
&lt;p&gt;In this post, we explain what asynchronous I/O is, how it works in Postgres 18, and what it means for performance optimization.&lt;/p&gt;
&lt;h2 id=&quot;why-asynchronous-io-matters&quot; &gt;&lt;a href=&quot;#why-asynchronous-io-matters&quot; aria-label=&quot;why asynchronous io matters permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why asynchronous I/O matters&lt;/h2&gt;
&lt;p&gt;Postgres has historically operated under a synchronous I/O model, meaning every read request is a blocking system call. The database must pause and wait for the operating system to return the data before continuing. This design introduces unnecessary waits on I/O, especially in cloud environments where storage is often network-attached (e.g. Amazon EBS) and I/O can have over 1ms of latency.&lt;/p&gt;
&lt;p&gt;In a simplified model, we can illustrate the difference like this, ignoring any prefetching/batching the Linux kernel might do:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Diagram showing synchronous vs asynchronous I/O model with concurrent requests&quot;
        title=&quot;In the asynchronous I/O model, multiple read requests can be in flight simultaneously&quot;
        src=&quot;https://pganalyze.com/static/cd0be5dde105345bb288ac73655b90f1/1d69c/sync_vs_async.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;You can picture synchronous I/O like an imaginary librarian who retrieves one book at a time, returning before fetching the next. This inefficiency compounds as the number of physical reads for a logical operation increases.&lt;/p&gt;
&lt;p&gt;Asynchronous I/O eliminates that bottleneck by allowing programs to issue multiple read requests concurrently, without waiting for prior reads to return. In an async program flow, I/O requests are scheduled to be read into a memory location and the program waits for completion of those reads, instead of issuing each read individually.&lt;/p&gt;
&lt;h3 id=&quot;how-postgres-17s-read-streams-paved-the-way&quot; &gt;&lt;a href=&quot;#how-postgres-17s-read-streams-paved-the-way&quot; aria-label=&quot;how postgres 17s read streams paved the way permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How Postgres 17’s read streams paved the way&lt;/h3&gt;
&lt;p&gt;The work to implement asynchronous I/O in Postgres has been many years in the making. Postgres 17 added an essential internal abstraction &lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-17-streaming-io&quot;&gt;with the introduction of read stream APIs&lt;/a&gt;. These internal changes standardized how read operations were issued across different subsystems and streamlined the use of &lt;code &gt;posix_fadvise()&lt;/code&gt; to request that the operating system prefetch data in advance.&lt;/p&gt;
&lt;p&gt;However, this advisory mechanism only hinted to the kernel to load data into the OS page cache, not into Postgres’ own shared buffers. Postgres still had to issue syscalls for each read, and OS readahead behaviour is not always consistent.&lt;/p&gt;
&lt;p&gt;The upcoming Postgres 18 release removes this indirection. With true asynchronous reads, data is fetched directly into shared buffers by the database itself, bypassing reliance on kernel-level heuristics and enabling more predictable, higher-throughput I/O behavior.&lt;/p&gt;
&lt;div &gt;
  &lt;a href=&quot;https://pganalyze.com/tools/postgres-performance-check-list?utm_source=blog&amp;amp;utm_medium=banner&amp;amp;utm_campaign=postgres_performance_checklist&amp;amp;utm_content=postgres-18-async-io&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;
    &lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Prevent Postgres slowdowns with this performance checklist&quot; title=&quot;Prevent Postgres slowdowns with this performance checklist&quot; src=&quot;https://pganalyze.com/static/929a7f456fb9ee2a562bd4a9d7a54f9a/1d69c/Check-list-blog-banner-ad.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
  &lt;/a&gt;
&lt;/div&gt;
&lt;h2 id=&quot;new-io_method-setting-in-postgres-18&quot; &gt;&lt;a href=&quot;#new-io_method-setting-in-postgres-18&quot; aria-label=&quot;new io_method setting in postgres 18 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;New io_method setting in Postgres 18&lt;/h2&gt;
&lt;p&gt;To control the mechanism used for asynchronous I/O, Postgres 18 introduces a new configuration parameter: &lt;code &gt;io_method&lt;/code&gt;. This setting determines how read operations are dispatched under the hood, and whether they’re handled synchronously, offloaded to I/O workers, or submitted directly to the kernel via &lt;code &gt;io_uring&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code &gt;io_method&lt;/code&gt; setting must be set in postgresql.conf and cannot be changed without restarting the server. It controls which I/O implementation Postgres will use and is essential to understand when tuning I/O performance in Postgres 18. There are three possible settings for &lt;code &gt;io_method&lt;/code&gt;, with the current default (as of Beta 1) being &lt;code &gt;worker&lt;/code&gt;.&lt;/p&gt;
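&lt;p&gt;To check what a given server is running with, the standard &lt;code &gt;pg_settings&lt;/code&gt; view can be queried. This is a minimal sketch, assuming a Postgres 18 server. The &lt;code &gt;context&lt;/code&gt; column indicates how a setting can be changed, with &lt;code &gt;postmaster&lt;/code&gt; meaning a restart is required:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Show the current I/O related settings and how each one can be changed
SELECT name, setting, context
  FROM pg_settings
 WHERE name IN (&apos;io_method&apos;, &apos;io_workers&apos;, &apos;effective_io_concurrency&apos;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;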
&lt;h3 id=&quot;io_method--sync&quot; &gt;&lt;a href=&quot;#io_method--sync&quot; aria-label=&quot;io_method  sync permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;io_method = sync&lt;/h3&gt;
&lt;p&gt;The &lt;code &gt;sync&lt;/code&gt; setting in Postgres 18 mirrors the synchronous behavior of Postgres 17. Reads are still synchronous and blocking, using &lt;code &gt;posix_fadvise()&lt;/code&gt; to achieve read-ahead in the Linux kernel.&lt;/p&gt;
&lt;h3 id=&quot;io_method--worker&quot; &gt;&lt;a href=&quot;#io_method--worker&quot; aria-label=&quot;io_method  worker permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;io_method = worker&lt;/h3&gt;
&lt;p&gt;The &lt;code &gt;worker&lt;/code&gt; setting utilizes dedicated &lt;strong&gt;I/O worker processes&lt;/strong&gt; running in the background that retrieve data independently of query execution. The main backend process enqueues read requests, and these workers interact with the Linux kernel to fetch data, which is then delivered into shared buffers, &lt;strong&gt;without blocking the main process&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The number of I/O workers can be configured through the new &lt;code &gt;io_workers&lt;/code&gt; setting, and defaults to &lt;code &gt;3&lt;/code&gt;. These workers are always running, and shared across all connections and databases.&lt;/p&gt;
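&lt;p&gt;As a sketch of how the worker count could be raised: the value &lt;code &gt;8&lt;/code&gt; below is an arbitrary illustration, not a recommendation from our tests. We&apos;re also assuming that &lt;code &gt;io_workers&lt;/code&gt; can be changed with a configuration reload rather than a restart, which is worth confirming via the &lt;code &gt;context&lt;/code&gt; column in &lt;code &gt;pg_settings&lt;/code&gt; on your build:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Illustrative value only; requires superuser privileges
ALTER SYSTEM SET io_workers = 8;

-- Ask the server to re-read its configuration
SELECT pg_reload_conf();

-- Verify the value that is now in effect
SHOW io_workers;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;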
&lt;h3 id=&quot;io_method--io_uring&quot; &gt;&lt;a href=&quot;#io_method--io_uring&quot; aria-label=&quot;io_method  io_uring permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;io_method = io_uring&lt;/h3&gt;
&lt;p&gt;This Linux-specific method uses &lt;strong&gt;&lt;code &gt;io_uring&lt;/code&gt;&lt;/strong&gt;, a high-performance I/O interface introduced in kernel version 5.1. Asynchronous I/O has been available in Linux since kernel version 2.5, but it was largely considered inefficient and hard to use. &lt;code &gt;io_uring&lt;/code&gt; establishes a &lt;strong&gt;shared ring buffer&lt;/strong&gt; between Postgres and the kernel, minimizing syscall overhead. This is the most efficient option, &lt;strong&gt;eliminating the need for I/O worker processes entirely&lt;/strong&gt;, but is only available on newer Linux kernels and requires file systems and configurations compatible with &lt;code &gt;io_uring&lt;/code&gt; support.&lt;/p&gt;
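&lt;p&gt;Switching a test system over to &lt;code &gt;io_uring&lt;/code&gt; could look like the sketch below. It assumes a Linux server and a Postgres 18 build compiled with liburing support (&lt;code &gt;--with-liburing&lt;/code&gt;). &lt;code &gt;ALTER SYSTEM&lt;/code&gt; writes the value to postgresql.auto.conf, which has the same effect as editing postgresql.conf by hand:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Persist the new I/O method; this does not take effect until a restart
ALTER SYSTEM SET io_method = &apos;io_uring&apos;;

-- After restarting Postgres, confirm the active method
SHOW io_method;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;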
&lt;p&gt;&lt;strong&gt;Important note:&lt;/strong&gt; As of the Postgres 18 Beta 1, asynchronous I/O is supported for sequential scans, bitmap heap scans, and maintenance operations like &lt;code &gt;VACUUM&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&quot;asynchronous-io-in-action&quot; &gt;&lt;a href=&quot;#asynchronous-io-in-action&quot; aria-label=&quot;asynchronous io in action permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Asynchronous I/O in action&lt;/h2&gt;
&lt;p&gt;Asynchronous I/O delivers the most noticeable gains in cloud environments where storage is network-attached, such as Amazon EBS volumes. In these setups, individual disk reads often take multiple milliseconds, introducing substantial latency compared to local SSDs.&lt;/p&gt;
&lt;p&gt;With traditional synchronous I/O, each of these reads blocks query execution until the data arrives, leading to idle CPU time and degraded throughput. By contrast, asynchronous I/O allows Postgres to issue multiple read requests in parallel and continue processing while waiting for results. This reduces query latency and enables much more efficient use of available I/O bandwidth and CPU cycles.&lt;/p&gt;
&lt;h3 id=&quot;benchmark-on-aws-doubling-read-performance--even-greater-gains-from-io_uring&quot; &gt;&lt;a href=&quot;#benchmark-on-aws-doubling-read-performance--even-greater-gains-from-io_uring&quot; aria-label=&quot;benchmark on aws doubling read performance  even greater gains from io_uring permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Benchmark on AWS: Doubling read performance &amp;#x26; even greater gains from io_uring&lt;/h3&gt;
&lt;p&gt;To evaluate the performance impact of asynchronous I/O, we benchmarked a representative workload on AWS, comparing Postgres 17 with Postgres 18 using different &lt;code &gt;io_method&lt;/code&gt; settings. The workload remained identical across versions, allowing us to isolate the effects of the new I/O infrastructure.&lt;/p&gt;
&lt;p&gt;We tested on an AWS c7i.8xlarge instance (32 vCPUs, 64 GB RAM), with a dedicated 100 GB &lt;code &gt;io2&lt;/code&gt; EBS volume for Postgres, with 20,000 provisioned IOPS. The test table was 3.5 GB in size:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; test&lt;span &gt;(&lt;/span&gt;id &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; test &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; generate_series&lt;span &gt;(&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;100000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# \dt+
                                   List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method |  Size   | Description 
--------+------+-------+----------+-------------+---------------+---------+-------------
 public | test | table | postgres | permanent   | heap          | 3458 MB | 
(1 row)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Between test runs we cleared the OS page cache (&lt;code &gt;sync; echo 3 &gt; /proc/sys/vm/drop_caches&lt;/code&gt;), and restarted Postgres, to gather cold cache results. Warm cache results represent running the query a second time. We repeated the complete test run three times for each configuration, retaining the best result.&lt;/p&gt;
&lt;p&gt;Whilst we also tested with parallel query, to keep results easier to understand all results below are with parallel query turned off (&lt;code &gt;max_parallel_workers_per_gather = 0&lt;/code&gt;).&lt;/p&gt;
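&lt;p&gt;For reference, a session-level version of that setup might look like the following sketch. The &lt;code &gt;EXPLAIN&lt;/code&gt; step is an extra check added here for illustration, to confirm the plan has no Gather node before timing the query:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Disable parallel query for the current session only
SET max_parallel_workers_per_gather = 0;

-- The plan should now show an aggregate over a plain sequential scan
EXPLAIN SELECT COUNT(*) FROM test;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;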
&lt;p&gt;&lt;strong&gt;Cold cache results:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Postgres 17, using synchronous I/O, established the baseline. It showed consistent read latency, but throughput was limited by the need to complete each I/O request before issuing the next:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# SELECT COUNT(*) FROM test;
   count   
-----------
 100000001
(1 row)

Time: 15830.880 ms (00:15.831)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Postgres 18, when configured with &lt;code &gt;io_method = sync&lt;/code&gt;, performed nearly identically, confirming that behavior remains unchanged without enabling asynchronous I/O:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# SELECT COUNT(*) FROM test;
   count   
-----------
 100000001
(1 row)

Time: 15071.089 ms (00:15.071)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;However, when we switch to the &lt;code &gt;worker&lt;/code&gt; method with 3 I/O workers (the default), a clear improvement shows:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# SELECT COUNT(*) FROM test;
   count   
-----------
 100000001
(1 row)

Time: 10051.975 ms (00:10.052)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We observed some gains by raising the number of I/O workers, but the biggest improvement comes when utilizing &lt;code &gt;io_uring&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# SELECT COUNT(*) FROM test;
   count   
-----------
 100000001
(1 row)

Time: 5723.423 ms (00:05.723)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When we graph this (measuring runtime in ms, lower is better), it’s clear that Postgres 18 performs significantly better in cold cache situations:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Read performance comparison between Postgres 17 and 18 with different io_method settings&quot;
        title=&quot;Read performance comparison between Postgres 17 and 18 with different io_method settings&quot;
        src=&quot;https://pganalyze.com/static/506febf39b7d14c7ba413260d30b63cc/1d69c/runtime-compared.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;For cold cache tests, both &lt;code &gt;worker&lt;/code&gt; and &lt;code &gt;io_uring&lt;/code&gt; delivered a consistent &lt;strong&gt;2-3x improvement&lt;/strong&gt; in read performance compared to the legacy &lt;code &gt;sync&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;Whilst &lt;code &gt;worker&lt;/code&gt; offers a slight benefit for warm cache tests due to its parallelism, &lt;code &gt;io_uring&lt;/code&gt; consistently performed better in cold cache tests, and its lower syscall overhead and reduced process coordination would make &lt;strong&gt;&lt;code &gt;io_uring&lt;/code&gt; the recommended setting&lt;/strong&gt; for maximizing I/O performance in Postgres 18.&lt;/p&gt;
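&lt;p&gt;As a rough sketch of how these settings could be switched on a self-managed server: the statements below assume superuser access, a Linux host, and a Postgres 18 build with &lt;code &gt;io_uring&lt;/code&gt; support compiled in. Managed cloud services may expose these parameters differently or not at all:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- io_method is read at server start, so this needs a restart to take effect
ALTER SYSTEM SET io_method = &apos;io_uring&apos;;

-- Alternative when io_uring is not available: stay on the worker method
-- and raise the number of I/O workers (3 is the default)
-- ALTER SYSTEM SET io_method = &apos;worker&apos;;
-- ALTER SYSTEM SET io_workers = 8;  -- illustrative value, not a recommendation

-- After restarting Postgres, confirm the active method
SHOW io_method;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;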
&lt;p&gt;This performance shift for disk reads has meaningful implications for infrastructure planning, especially in cloud environments. By reducing I/O wait time, asynchronous reads can substantially increase query throughput, reduce latency and CPU overhead. For read-heavy workloads, this may translate into smaller instance sizes or better utilization of existing resources.&lt;/p&gt;
&lt;h3 id=&quot;tuning-effective_io_concurrency&quot; &gt;&lt;a href=&quot;#tuning-effective_io_concurrency&quot; aria-label=&quot;tuning effective_io_concurrency permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Tuning effective_io_concurrency&lt;/h3&gt;
&lt;p&gt;In Postgres 18, &lt;code &gt;effective_io_concurrency&lt;/code&gt; becomes more interesting, but only when used with an asynchronous &lt;code &gt;io_method&lt;/code&gt; such as &lt;code &gt;worker&lt;/code&gt; or &lt;code &gt;io_uring&lt;/code&gt;. Previously, this setting merely advised the OS to prefetch data using &lt;code &gt;posix_fadvise&lt;/code&gt;. Now, it directly controls how many asynchronous read-ahead requests Postgres issues internally.&lt;/p&gt;
&lt;p&gt;The number of blocks read ahead is influenced by both &lt;code &gt;effective_io_concurrency&lt;/code&gt; and &lt;code &gt;io_combine_limit&lt;/code&gt;, following the general formula:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;maximum read-ahead = effective_io_concurrency × io_combine_limit&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This gives DBAs and engineers greater control over I/O behavior. The optimal value requires benchmarking, as it depends on your I/O subsystem. For example, higher values may benefit cloud environments with high latency that also support high concurrency, like AWS EBS with high provisioned IOPS.&lt;/p&gt;
&lt;p&gt;When doing our benchmarks, we also tested higher &lt;code &gt;effective_io_concurrency&lt;/code&gt; (between 16 and 128) but did not see a meaningful difference. However, that is likely due to the simple test query used.&lt;/p&gt;
&lt;p&gt;It’s worth noting that the default of &lt;code &gt;effective_io_concurrency&lt;/code&gt; has been raised from 1 in Postgres 17 to 16 in Postgres 18, &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ff79b5b2ab&quot;&gt;based on benchmarks done by the Postgres community&lt;/a&gt;.&lt;/p&gt;
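&lt;p&gt;To make the read-ahead formula concrete, the following sketch computes it from the current server settings. It assumes that &lt;code &gt;pg_settings&lt;/code&gt; reports &lt;code &gt;io_combine_limit&lt;/code&gt; as a number of blocks, and it only reads the server-wide values, ignoring any per-tablespace override of &lt;code &gt;effective_io_concurrency&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- maximum read-ahead = effective_io_concurrency × io_combine_limit
SELECT eic.setting::int AS effective_io_concurrency,
       icl.setting::int AS io_combine_limit_blocks,
       eic.setting::int * icl.setting::int AS max_read_ahead_blocks,
       -- convert blocks to bytes using the server block size
       pg_size_pretty(eic.setting::bigint * icl.setting::bigint
                      * current_setting(&apos;block_size&apos;)::bigint) AS max_read_ahead_size
  FROM pg_settings eic
 CROSS JOIN pg_settings icl
 WHERE eic.name = &apos;effective_io_concurrency&apos;
   AND icl.name = &apos;io_combine_limit&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With &lt;code &gt;effective_io_concurrency = 16&lt;/code&gt; and combined reads of 16 blocks (128 kB), this would work out to 256 blocks, or 2 MB with 8 kB blocks.&lt;/p&gt;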
&lt;h3 id=&quot;monitoring-ios-in-flight-with-pg_aios&quot; &gt;&lt;a href=&quot;#monitoring-ios-in-flight-with-pg_aios&quot; aria-label=&quot;monitoring ios in flight with pg_aios permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Monitoring I/Os in flight with pg_aios&lt;/h3&gt;
&lt;p&gt;As mentioned, previous versions of Postgres with synchronous I/O made it easy to spot read delays: the backend process would block while waiting for disk access, and monitoring tools like pganalyze can reliably surface &lt;code &gt;IO / DataFileRead&lt;/code&gt; as a wait event during these stalls.&lt;/p&gt;
&lt;p&gt;For example, here we can see wait events clearly in Postgres 17 synchronous I/O.&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Screenshot of pganalyze showing wait events in Postgres 17&quot;
        title=&quot;pganalyze interface showing clear IO / DataFileRead wait events in Postgres 17&quot;
        src=&quot;https://pganalyze.com/static/67303bca18e1ab006c16c26979172b33/1d69c/wait_events_io_read.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;With asynchronous I/O in Postgres 18, backend wait behavior changes. When using &lt;code &gt;io_method = worker&lt;/code&gt;, the backend process delegates reads to a separate I/O worker. As a result, the backend may appear idle or show the new &lt;code &gt;IO / AioIoCompletion&lt;/code&gt; wait event, while the I/O worker shows the actual I/O wait events:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; backend_type&lt;span &gt;,&lt;/span&gt; state&lt;span &gt;,&lt;/span&gt; wait_event_type&lt;span &gt;,&lt;/span&gt; wait_event
  &lt;span &gt;FROM&lt;/span&gt; pg_stat_activity
 &lt;span &gt;WHERE&lt;/span&gt; backend_type &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;client backend&apos;&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; backend_type &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;io worker&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;  backend_type  | state  | wait_event_type |   wait_event    
----------------+--------+-----------------+-----------------
 client backend | active | IO              | AioIoCompletion
 io worker      |        | IO              | DataFileRead
 io worker      |        | IO              | DataFileRead
 io worker      |        | IO              | DataFileRead
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With &lt;code &gt;io_method = io_uring&lt;/code&gt;, read operations are submitted directly to the kernel and completed asynchronously. The backend does not block on a traditional I/O syscall, so this activity is not visible from the Postgres side, even though I/O is in progress.&lt;/p&gt;
&lt;p&gt;To help with debugging of I/O requests in flight, the new &lt;code &gt;pg_aios&lt;/code&gt; view can show Postgres internal state, even when using &lt;code &gt;io_uring&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_aios&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;  pid  | io_id | io_generation |    state     | operation |    off    | length | target | handle_data_len | raw_result | result  |                   target_desc                    | f_sync | f_localmem | f_buffered 
-------+-------+---------------+--------------+-----------+-----------+--------+--------+-----------------+------------+---------+--------------------------------------------------+--------+------------+------------
 91452 |     1 |          4781 | SUBMITTED    | read      | 996278272 | 131072 | smgr   |              16 |            | UNKNOWN | blocks 383760..383775 in file &quot;base/16384/16389&quot; | f      | f          | t
 91452 |     2 |          4785 | SUBMITTED    | read      | 996147200 | 131072 | smgr   |              16 |            | UNKNOWN | blocks 383744..383759 in file &quot;base/16384/16389&quot; | f      | f          | t
 91452 |     3 |          4796 | SUBMITTED    | read      | 996409344 | 131072 | smgr   |              16 |            | UNKNOWN | blocks 383776..383791 in file &quot;base/16384/16389&quot; | f      | f          | t
 91452 |     4 |          4802 | SUBMITTED    | read      | 996016128 | 131072 | smgr   |              16 |            | UNKNOWN | blocks 383728..383743 in file &quot;base/16384/16389&quot; | f      | f          | t
 91452 |     5 |          3175 | COMPLETED_IO | read      | 995885056 | 131072 | smgr   |              16 |     131072 | UNKNOWN | blocks 383712..383727 in file &quot;base/16384/16389&quot; | f      | f          | t
(5 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Understanding these behavior changes, and the impact of asynchronous execution more broadly, is essential when optimizing I/O performance in Postgres 18.&lt;/p&gt;
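&lt;p&gt;Since &lt;code &gt;pg_aios&lt;/code&gt; returns one row per I/O handle, a summary per backend can be easier to read on a busy system. The query below is a minimal sketch that only uses columns visible in the output above. Keep in mind that it is a point-in-time snapshot, so an empty result does not mean that no I/O is happening:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Summarize in-flight asynchronous I/Os per backend and state
SELECT pid,
       state,
       operation,
       count(*)    AS ios,
       sum(length) AS total_bytes
  FROM pg_aios
 GROUP BY pid, state, operation
 ORDER BY pid, state;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;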
&lt;h2 id=&quot;heads-up-async-io-makes-io-timing-information-hard-to-interpret&quot; &gt;&lt;a href=&quot;#heads-up-async-io-makes-io-timing-information-hard-to-interpret&quot; aria-label=&quot;heads up async io makes io timing information hard to interpret permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Heads Up: Async I/O makes I/O timing information hard to interpret&lt;/h2&gt;
&lt;p&gt;Asynchronous I/O introduces a shift in how execution timing is reported. When the backend no longer blocks directly on disk reads (as is the case with &lt;code &gt;worker&lt;/code&gt; or &lt;code &gt;io_uring&lt;/code&gt;) the complete time spent doing I/O may not be reflected in &lt;code &gt;EXPLAIN ANALYZE&lt;/code&gt; output. This can make I/O-bound queries seem to require less I/O effort than previously.&lt;/p&gt;
&lt;p&gt;First, let&apos;s run the earlier query in &lt;code &gt;EXPLAIN ANALYZE&lt;/code&gt; on a cold cache in Postgres 17:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# EXPLAIN (ANALYZE, BUFFERS, TIMING OFF) SELECT COUNT(*) FROM test;
                                               QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1692478.40..1692478.41 rows=1 width=8) (actual rows=1 loops=1)
   Buffers: shared read=442478
   I/O Timings: shared read=14779.316
   -&gt;  Seq Scan on test  (cost=0.00..1442478.32 rows=100000032 width=0) (actual rows=100000001 loops=1)
         Buffers: shared read=442478
         I/O Timings: shared read=14779.316
 Planning:
   Buffers: shared hit=13 read=6
   I/O Timings: shared read=3.182
 Planning Time: 8.136 ms
 Execution Time: 18006.405 ms
(11 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We&apos;ve read 442,478 buffers in 14.8 seconds.&lt;/p&gt;
&lt;p&gt;And now, we repeat the test on Postgres 18 with the default settings (&lt;code &gt;io_method = worker&lt;/code&gt;):&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;test=# EXPLAIN (ANALYZE, BUFFERS, TIMING OFF) SELECT COUNT(*) FROM test;
                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1692478.40..1692478.41 rows=1 width=8) (actual rows=1.00 loops=1)
   Buffers: shared read=442478
   I/O Timings: shared read=7218.835
   -&gt;  Seq Scan on test  (cost=0.00..1442478.32 rows=100000032 width=0) (actual rows=100000001.00 loops=1)
         Buffers: shared read=442478
         I/O Timings: shared read=7218.835
 Planning:
   Buffers: shared hit=13 read=6
   I/O Timings: shared read=2.709
 Planning Time: 2.925 ms
 Execution Time: 10480.827 ms
(11 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We&apos;ve read 442,478 buffers in 7.2 seconds.&lt;/p&gt;
&lt;p&gt;Whilst with parallel query we get a summary of all the I/O time across all parallel workers, no such summarization occurs with I/O workers. What we are seeing is the wait time for the I/O to be completed, ignoring any parallelism that may happen behind the scenes.&lt;/p&gt;
&lt;p&gt;This is technically not a behavior change: even in Postgres 17, the time reported was the time spent waiting on I/Os, not the time spent performing the I/O. For example, kernel I/O time for readahead was never accounted for.&lt;/p&gt;
&lt;p&gt;Historically, I/O timing was often equated with I/O effort and used in addition to shared buffer read counts, since timing helps distinguish an actual disk read from an OS page cache hit. Now, in Postgres 18, interpreting I/O timing requires more caution: asynchronous I/O can hide I/O overhead in query plans.&lt;/p&gt;
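&lt;p&gt;A quick back-of-the-envelope calculation illustrates the point. The sketch below takes the two &lt;code &gt;EXPLAIN&lt;/code&gt; outputs above and, assuming the default 8 kB block size, derives the read rate that the reported I/O timings would imply:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- 442478 buffers × 8 kB each, converted to MB, divided by the reported I/O time in seconds
SELECT round(442478 * 8 / 1024.0 / 14.779316, 1) AS pg17_apparent_mb_per_s,
       round(442478 * 8 / 1024.0 / 7.218835, 1)  AS pg18_apparent_mb_per_s;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This comes out at roughly 234 MB/s for Postgres 17 and 479 MB/s for Postgres 18. The same amount of data was read in both cases. What changed is how long the backend spent waiting for it.&lt;/p&gt;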
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;To summarize, the upcoming release of Postgres 18 marks the beginning of a major evolution in how I/O is handled. While currently limited to reads, asynchronous I/O already opens the door to significant performance improvements in high-latency cloud environments.&lt;/p&gt;
&lt;p&gt;But some of these gains come with tradeoffs. Engineering teams will need to adjust their observability practices, learn new semantics for timing and wait events, and perhaps revisit tuning parameters with previously limited impact, like &lt;code &gt;effective_io_concurrency&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&quot;in-summary&quot; &gt;&lt;a href=&quot;#in-summary&quot; aria-label=&quot;in summary permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;In summary&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Asynchronous I/O support in Postgres 18 introduces &lt;code &gt;worker&lt;/code&gt; (as the default) and &lt;code &gt;io_uring&lt;/code&gt; options under the new &lt;code &gt;io_method&lt;/code&gt; setting.&lt;/li&gt;
&lt;li&gt;Benchmarks show up to a 2-3x throughput improvement for read-heavy workloads in cloud environments.&lt;/li&gt;
&lt;li&gt;Observability practices need to evolve: &lt;code &gt;EXPLAIN ANALYZE&lt;/code&gt; may underreport I/O effort, and new views like &lt;code &gt;pg_aios&lt;/code&gt; will help provide insights.&lt;/li&gt;
&lt;li&gt;Tools like pganalyze will be adapting to these changes to continue surfacing relevant performance insights.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As Postgres development continues, future versions (19 and beyond) may bring asynchronous write support, further reducing I/O bottlenecks in modern workloads, and enabling production use of Direct I/O.&lt;/p&gt;
&lt;h3 id=&quot;references&quot; &gt;&lt;a href=&quot;#references&quot; aria-label=&quot;references permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;References&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/devel/runtime-config-resource.html#GUC-IO-METHOD&quot;&gt;PostgreSQL &lt;code &gt;io_method&lt;/code&gt; GUC (Postgres 18)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY&quot;&gt;PostgreSQL &lt;code &gt;effective_io_concurrency&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/storage-buffer.html&quot;&gt;PostgreSQL Shared Buffers and Buffer Management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW&quot;&gt;&lt;code &gt;pg_stat_activity&lt;/code&gt; View&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/devel/monitoring-stats.html#PG-STAT-IO-VIEW&quot;&gt;&lt;code &gt;pg_stat_io&lt;/code&gt; View&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/devel/monitoring-stats.html#PG-AIOS-VIEW&quot;&gt;&lt;code &gt;pg_aios&lt;/code&gt; View (New in Postgres 18)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://man7.org/linux/man-pages/man2/posix_fadvise.2.html&quot;&gt;&lt;code &gt;posix_fadvise()&lt;/code&gt; System Call&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.man7.org/linux/man-pages/man7/io_uring.7.html&quot;&gt;Linux io_uring Man Page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-17-streaming-io&quot;&gt;5mins of Postgres: Waiting for Postgres 17: Streaming I/O for sequential scans &amp;#x26; ANALYZE&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Postgres vs. SQL Server: B-Tree Index Differences & the Benefit of Deduplication]]></title><description><![CDATA[When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways. In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree indexes implementations behave under the hood and how they…]]></description><link>https://pganalyze.com/blog/postgresql-vs-sql-server-btree-index-deduplication</link><guid isPermaLink="false">https://pganalyze.com/blog/postgresql-vs-sql-server-btree-index-deduplication</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 03 Apr 2025 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways.&lt;/p&gt;
&lt;p&gt;In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree index implementations behave under the hood and how they store and access data on disk. We&apos;ll also benchmark the impact of deduplication of values on index size in each database system.&lt;/p&gt;
&lt;p&gt;We&apos;ve also included a comprehensive reference guide at the end (see &lt;a href=&quot;#comparison-table-postgresql-vs-sql-server-indexing&quot;&gt;Postgres vs. SQL Server Index Comparison Table&lt;/a&gt;). Whether you&apos;re optimizing queries or planning a migration, these differences can have a meaningful impact on both performance and indexing strategy.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#how-b-tree-indexing-works-in-postgresql-vs-sql-server&quot;&gt;How B-Tree indexing works in PostgreSQL vs. SQL Server&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#postgresqls-b-tree-deduplication&quot;&gt;PostgreSQL&apos;s B-Tree deduplication&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#benchmarking-b-tree-indexes-on-postgresql-vs-sql-server&quot;&gt;Benchmarking B-Tree indexes on PostgreSQL vs. SQL Server&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#postgresql-test-setup&quot;&gt;PostgreSQL Test Setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#sql-server-test-setup&quot;&gt;SQL Server Test Setup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#benchmark-results-postgresqls-deduplication-reduces-index-size&quot;&gt;Benchmark results: PostgreSQL&apos;s deduplication reduces index size&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#comparison-table-postgresql-vs-sql-server-indexing&quot;&gt;Comparison Table: PostgreSQL vs. SQL Server Indexing&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#choosing-the-right-index-for-your-workload&quot;&gt;Choosing the right index for your workload&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#references&quot;&gt;References:&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;how-b-tree-indexing-works-in-postgresql-vs-sql-server&quot; &gt;&lt;a href=&quot;#how-b-tree-indexing-works-in-postgresql-vs-sql-server&quot; aria-label=&quot;how b tree indexing works in postgresql vs sql server permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How B-Tree indexing works in PostgreSQL vs. SQL Server&lt;/h2&gt;
&lt;p&gt;At a high level, both databases use B-Tree indexes to speed up equality and range queries. B-Trees maintain sorted order and are balanced for consistent read performance. But while the concept is similar in both databases, the way it&apos;s implemented has important performance consequences.&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;SQL Server: Clustered vs Nonclustered Index&quot;
        title=&quot;SQL Server: Clustered vs Nonclustered Index&quot;
        src=&quot;https://pganalyze.com/static/4ee84e2e9209e2207b8ca662ad406f03/1d69c/sql_server_index_types.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;SQL Server uses clustered indexes to physically order the table&apos;s data by the indexed column. When a clustered index is defined, the rows in the table are stored in the same order as the index itself. Nonclustered indexes are stored separately and point to rows using a row locator, either a RID or the clustered key. This physical ordering can be beneficial for range scans or pagination queries, but it also means you&apos;re limited to one clustered index per table. More importantly, SQL Server stores each index entry in full, even if multiple entries have identical values on the same page. There&apos;s no deduplication, so indexes with many repeated values can grow large and consume excessive I/O.&lt;/p&gt;
&lt;div &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Postgres B-Tree Index&quot; title=&quot;Postgres B-Tree Index&quot; src=&quot;https://pganalyze.com/static/cea9823e4460df530b3a23e0787ba953/e9beb/btree_index_postgres.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;PostgreSQL does not have clustered indexes in the SQL Server sense. All PostgreSQL tables are stored as unordered heaps, and indexes are purely logical structures that point to tuples in the heap. This design gives PostgreSQL some flexibility: it allows for easier index maintenance and avoids the complications of physical reordering.&lt;/p&gt;
&lt;p&gt;However, it also means that you can&apos;t rely on an index to define how the table is physically laid out. If query performance depends on reading data in a particular order, Postgres does allow you to run the &lt;code &gt;CLUSTER&lt;/code&gt; command, but it requires a full table lock. In production environments, you can use tools like &lt;code &gt;pg_repack&lt;/code&gt; to achieve a similar result.&lt;/p&gt;
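&lt;p&gt;As a minimal sketch of what this looks like in practice, the example below uses a hypothetical &lt;code &gt;orders&lt;/code&gt; table and index name. Note that &lt;code &gt;CLUSTER&lt;/code&gt; is a one-time rewrite: rows inserted or updated afterwards are not kept in index order, so it needs to be repeated if the physical ordering matters.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table and index, for illustration only
CREATE TABLE orders (id int, created_at timestamptz);
CREATE INDEX orders_created_at_idx ON orders (created_at);

-- Rewrites the table in index order; blocks reads and writes while it runs
CLUSTER orders USING orders_created_at_idx;

-- Refresh planner statistics after the rewrite
ANALYZE orders;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;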
&lt;p&gt;So while both databases use B-Tree indexes as their default, SQL Server&apos;s tight coupling between index and physical storage creates a different set of expectations and limitations. PostgreSQL&apos;s index model has some performance downsides (since there is no clustered index implementation), but distinct features like deduplication make it perform better in other situations.&lt;/p&gt;
&lt;h3 id=&quot;postgresqls-b-tree-deduplication&quot; &gt;&lt;a href=&quot;#postgresqls-b-tree-deduplication&quot; aria-label=&quot;postgresqls b tree deduplication permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;PostgreSQL&apos;s B-Tree deduplication&lt;/h3&gt;
&lt;p&gt;Deduplication was introduced in PostgreSQL version 13 and addresses a common inefficiency in traditional B-Tree indexes. When many rows share the same indexed value—think status codes, boolean flags, or timestamps—standard B-Trees store each value and its corresponding tuple pointer individually. This results in bloated index pages and increased maintenance cost, especially for write-heavy workloads.&lt;/p&gt;
&lt;p&gt;PostgreSQL deduplicates repeated values within a single index page by default. Instead of storing the same key value multiple times, it stores it once and maintains a compact structure that tracks all matching heap pointers. This reduces index size significantly and improves cache performance, since more index entries fit in memory.&lt;/p&gt;
&lt;p&gt;SQL Server does not support deduplication. Each index entry is stored independently, even if the values are identical. In datasets with skewed distributions, PostgreSQL&apos;s approach leads to more compact, more efficient indexes, with fewer pages and less disk I/O.&lt;/p&gt;
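&lt;p&gt;One way to see the effect within Postgres itself is to build the same index twice, once with deduplication turned off through the &lt;code &gt;deduplicate_items&lt;/code&gt; storage parameter. The sketch below uses a hypothetical &lt;code &gt;dedup_demo&lt;/code&gt; table with only ten distinct values. The exact sizes will depend on your Postgres version and data:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical demo table: 1 million rows, 10 distinct values
CREATE TABLE dedup_demo (status int);
INSERT INTO dedup_demo SELECT i % 10 FROM generate_series(1, 1000000) AS i;

-- Deduplication is on by default (Postgres 13 and newer)
CREATE INDEX dedup_demo_on ON dedup_demo (status);

-- Same index definition, with deduplication explicitly disabled
CREATE INDEX dedup_demo_off ON dedup_demo (status) WITH (deduplicate_items = off);

-- Compare the on-disk size of both indexes
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS index_size
  FROM pg_class
 WHERE relname IN (&apos;dedup_demo_on&apos;, &apos;dedup_demo_off&apos;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;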
&lt;h3 id=&quot;benchmarking-b-tree-indexes-on-postgresql-vs-sql-server&quot; &gt;&lt;a href=&quot;#benchmarking-b-tree-indexes-on-postgresql-vs-sql-server&quot; aria-label=&quot;benchmarking b tree indexes on postgresql vs sql server permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Benchmarking B-Tree indexes on PostgreSQL vs. SQL Server&lt;/h3&gt;
&lt;p&gt;To understand how PostgreSQL&apos;s index deduplication affects real-world performance and storage, we ran a benchmark comparing B-Tree index sizes across PostgreSQL and SQL Server under varying levels of data duplication. Each test created a table of 10 million rows with differing levels of value repetition, ranging from entirely unique values to repeated values at a 1000x factor.&lt;/p&gt;
&lt;p&gt;Here&apos;s how we structured the test in both databases, so you can reproduce it yourself.&lt;/p&gt;
&lt;h4 id=&quot;postgresql-test-setup&quot; &gt;&lt;a href=&quot;#postgresql-test-setup&quot; aria-label=&quot;postgresql test setup permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;PostgreSQL Test Setup&lt;/h4&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_1&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_10&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_100&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_1000&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_1 &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_10 &lt;span &gt;SELECT&lt;/span&gt; val &lt;span &gt;/&lt;/span&gt; &lt;span &gt;10&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; x&lt;span &gt;(&lt;/span&gt;val&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_100 &lt;span &gt;SELECT&lt;/span&gt; val &lt;span &gt;/&lt;/span&gt; &lt;span &gt;100&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; x&lt;span &gt;(&lt;/span&gt;val&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_1000 &lt;span &gt;SELECT&lt;/span&gt; val &lt;span &gt;/&lt;/span&gt; &lt;span &gt;1000&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; x&lt;span &gt;(&lt;/span&gt;val&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1_idx &lt;span &gt;ON&lt;/span&gt; factor_1&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_10_idx &lt;span &gt;ON&lt;/span&gt; factor_10&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_100_idx &lt;span &gt;ON&lt;/span&gt; factor_100&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1000_idx &lt;span &gt;ON&lt;/span&gt; factor_1000&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1_idx_no_dup_fill100 &lt;span &gt;ON&lt;/span&gt; factor_1&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WITH&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;deduplicate_items &lt;span &gt;=&lt;/span&gt; &lt;span &gt;off&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;fillfactor&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;100&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_10_idx_no_dup_fill100 &lt;span &gt;ON&lt;/span&gt; factor_10&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WITH&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;deduplicate_items &lt;span &gt;=&lt;/span&gt; &lt;span &gt;off&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;fillfactor&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;100&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_100_idx_no_dup_fill100 &lt;span &gt;ON&lt;/span&gt; factor_100&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WITH&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;deduplicate_items &lt;span &gt;=&lt;/span&gt; &lt;span &gt;off&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;fillfactor&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;100&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1000_idx_no_dup_fill100 &lt;span &gt;ON&lt;/span&gt; factor_1000&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WITH&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;deduplicate_items &lt;span &gt;=&lt;/span&gt; &lt;span &gt;off&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;fillfactor&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;100&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
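&lt;p&gt;The setup above does not show how the resulting index sizes were read. One way to compare them, as a minimal sketch that assumes the tables and indexes above exist in the current database, is to query &lt;code &gt;pg_class&lt;/code&gt; together with &lt;code &gt;pg_relation_size&lt;/code&gt;. The &lt;code &gt;LIKE&lt;/code&gt; pattern is our assumption and simply matches the index names used in this test:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- List every index created in the test setup with its on-disk size
SELECT c.relname AS index_name,
       pg_relation_size(c.oid) AS size_bytes,
       pg_size_pretty(pg_relation_size(c.oid)) AS size_pretty
FROM pg_class c
WHERE c.relkind = &apos;i&apos;              -- indexes only
  AND c.relname LIKE &apos;factor%&apos;     -- index names from the setup above
ORDER BY c.relname;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Comparing each &lt;code &gt;_idx&lt;/code&gt; row with its &lt;code &gt;_idx_no_dup_fill100&lt;/code&gt; counterpart shows the effect of deduplication at each duplication factor.&lt;/p&gt;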
&lt;h4 id=&quot;sql-server-test-setup&quot; &gt;&lt;a href=&quot;#sql-server-test-setup&quot; aria-label=&quot;sql server test setup permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;SQL Server Test Setup&lt;/h4&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_1&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_10&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_100&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; factor_1000&lt;span &gt;(&lt;/span&gt;col &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_1 &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_10 &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;value&lt;/span&gt; &lt;span &gt;/&lt;/span&gt; &lt;span &gt;10&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_100 &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;value&lt;/span&gt; &lt;span &gt;/&lt;/span&gt; &lt;span &gt;100&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; factor_1000 &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;value&lt;/span&gt; &lt;span &gt;/&lt;/span&gt; &lt;span &gt;1000&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; GENERATE_SERIES&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;10000000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1_idx &lt;span &gt;ON&lt;/span&gt; factor_1&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_10_idx &lt;span &gt;ON&lt;/span&gt; factor_10&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_100_idx &lt;span &gt;ON&lt;/span&gt; factor_100&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; factor_1000_idx &lt;span &gt;ON&lt;/span&gt; factor_1000&lt;span &gt;(&lt;/span&gt;col&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
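&lt;p&gt;On the SQL Server side, the index sizes can be read from the catalog views. The following is a sketch rather than the exact query used for this benchmark. It assumes the indexes above exist in the current database and relies on SQL Server&apos;s 8 KB page size to convert page counts into megabytes:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Sum the pages used by each test index and convert to megabytes
SELECT i.name AS index_name,
       SUM(ps.used_page_count) AS used_pages,
       SUM(ps.used_page_count) * 8 / 1024 AS size_mb   -- 8 KB per page
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id = i.index_id
WHERE i.name LIKE &apos;factor%&apos;        -- index names from the setup above
GROUP BY i.name
ORDER BY i.name;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;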
&lt;h3 id=&quot;benchmark-results-postgresqls-deduplication-reduces-index-size&quot; &gt;&lt;a href=&quot;#benchmark-results-postgresqls-deduplication-reduces-index-size&quot; aria-label=&quot;benchmark results postgresqls deduplication reduces index size permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Benchmark results: PostgreSQL&apos;s deduplication reduces index size&lt;/h3&gt;
&lt;p&gt;When we benchmarked index sizes across PostgreSQL and SQL Server, we saw a sharp divergence as data duplication increased. With values repeated 1,000 times, a PostgreSQL index using deduplication was &lt;strong&gt;3x smaller&lt;/strong&gt; than the same index created with deduplication turned off. Compared to SQL Server, which does not support deduplication and stores each repeated value in full, PostgreSQL consistently produced smaller, more efficient indexes.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;
      &lt;span&gt;&lt;/span&gt;
  &lt;img
        alt=&quot;pganalyze-sql-server-postgresql-btree-index-size-benchmark.png&quot;
        title=&quot;pganalyze-sql-server-postgresql-btree-index-size-benchmark.png&quot;
        src=&quot;https://pganalyze.com/static/023f81d98fc55b7c3a409be0f9ca868e/1d69c/pganalyze-sql-server-postgresql-btree-index-size-benchmark.png&quot;
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;This difference matters. Columns with heavily repeated values (low cardinality), like status flags, timestamps, and categorical fields, are common in production systems. When these values repeat across millions of rows, indexes that store every duplicate in full grow large and can quickly become a performance bottleneck, slowing scans, increasing I/O, and inflating memory usage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PostgreSQL&apos;s deduplication reduces index size significantly&lt;/strong&gt;, making it easier to keep indexes in memory and reduce disk pressure. For teams moving from SQL Server to PostgreSQL, or simply scaling out workloads with heavily used indexes, this optimization isn&apos;t just theoretical. It has a direct impact on resource usage, query performance, and overall operational efficiency.&lt;/p&gt;
&lt;h2 id=&quot;comparison-table-postgresql-vs-sql-server-indexing&quot; &gt;&lt;a href=&quot;#comparison-table-postgresql-vs-sql-server-indexing&quot; aria-label=&quot;comparison table postgresql vs sql server indexing permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Comparison Table: PostgreSQL vs. SQL Server Indexing&lt;/h2&gt;
&lt;p&gt;Index implementations, for B-Tree and for other index types, vary significantly between PostgreSQL and SQL Server. We&apos;ve put together a comprehensive index comparison table that you can use as a reference in your SQL Server to PostgreSQL migrations.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(Certain index types exist in SQL Server but not in PostgreSQL or vice versa. We&apos;ve noted supportability as follows: 🟢 Supported index type  🔴 Not supported index type.)&lt;/em&gt;&lt;/p&gt;
&lt;table &gt;
  &lt;thead&gt;
    &lt;tr &gt;
      &lt;th &gt;Index Type&lt;/th&gt;
      &lt;th &gt;Use Case Example&lt;/th&gt;
      &lt;th &gt;PostgreSQL&lt;/th&gt;
      &lt;th &gt;SQL Server&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;B-Tree&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Best for general-purpose indexing, equality and range queries (e.g., filtering users by age or date).&lt;/td&gt;
      &lt;td &gt;🟢 Default index type, supports equality &amp; range queries, sorting, and pattern matching with prefixes.&lt;/td&gt;
      &lt;td &gt;🟢 On SQL Server the default structure for clustered and nonclustered indexes is a B-Tree.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;Clustered&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Automatically orders table rows by the index key; best for frequently sorted queries.&lt;/td&gt;
      &lt;td &gt;🔴 PostgreSQL does not have clustered indexes; instead, you can use the &lt;code&gt;CLUSTER&lt;/code&gt; command to order the table based on a nonclustered index; however, this order will not be preserved as new data gets inserted.&lt;/td&gt;
      &lt;td &gt;🟢 Built on a B-Tree structure like PostgreSQL&apos;s B-Tree indexes, but stores the table rows themselves, sorted by the index key.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;Nonclustered&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Useful for indexes that speed up searches without affecting physical storage order.&lt;/td&gt;
      &lt;td &gt;🟢 In PostgreSQL all indexes are nonclustered.&lt;/td&gt;
      &lt;td &gt;🟢 Can be created on heap or a clustered index; stores data separately from the table.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;Hash&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Optimized for exact match lookups, like searching by user ID or email address.&lt;/td&gt;
      &lt;td &gt;🟢 In PostgreSQL, hash indexes can only index a single column. While you can create multiple indexes to support a query, typically a multi-column B-Tree index is more effective.&lt;/td&gt;
      &lt;td &gt;🟢 Used for memory-optimized tables; requires a fixed bucket count.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;Filtered / Partial&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Efficient for indexing a subset of data, such as active users only.&lt;/td&gt;
      &lt;td &gt;🟢 PostgreSQL can use Partial Indexes to index only a subset of rows.&lt;/td&gt;
      &lt;td &gt;🟢 A Filtered Index is a nonclustered index that indexes only a subset of table rows.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;BRIN&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Best for very large tables where data is naturally ordered, such as time-series data.&lt;/td&gt;
      &lt;td &gt;🟢 Stores summaries of block ranges; best for large, sequentially stored data.&lt;/td&gt;
      &lt;td &gt;🔴 N/A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;Full-text&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Used for natural language searches, such as searching text in articles or product reviews.&lt;/td&gt;
      &lt;td &gt;🟢 PostgreSQL supports Full-Text Search using GIN indexes on &lt;code&gt;tsvector&lt;/code&gt; columns.&lt;/td&gt;
      &lt;td &gt;🟢 SQL Server uses an inverted index for text-based queries, similar to PostgreSQL GIN.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;GIN&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Great for indexing JSONB, arrays, and full-text search (e.g., searching product descriptions).&lt;/td&gt;
      &lt;td &gt;🟢 Inverted index; best for JSON, full-text search, and arrays.&lt;/td&gt;
      &lt;td &gt;🔴 Partial capability via Full-text index.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;Vector&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Efficiently perform similarity search or nearest neighbor search across high-dimensional data, most commonly in AI and machine learning applications.&lt;/td&gt;
      &lt;td &gt;🟢 PostgreSQL doesn&apos;t include vector support natively, but the open-source extension &lt;a href=&quot;https://github.com/pgvector/pgvector&quot;&gt;pgvector&lt;/a&gt; enables vector storage and indexing.&lt;/td&gt;
      &lt;td &gt;🔴 SQL Server does not natively support vector indexing or search. Microsoft recommends using its Azure AI Search instead.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;XML&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Optimized for querying and storing XML documents.&lt;/td&gt;
      &lt;td &gt;🔴 PostgreSQL does not support indexes directly on XML types; however, expression indexes can be used on subsets of the XML data. For unstructured documents, JSONB is the recommended data type.&lt;/td&gt;
      &lt;td &gt;🟢 SQL Server has dedicated indexes on XML data types.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;Spatial&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Used for geographic queries, e.g., finding locations within a radius.&lt;/td&gt;
      &lt;td &gt;🟢 In PostgreSQL, spatial indexing and queries are provided by the open source &lt;a href=&quot;https://postgis.net/&quot;&gt;PostGIS&lt;/a&gt; extension.&lt;/td&gt;
      &lt;td &gt;🟢 SQL Server has built-in spatial data types.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;SP-GiST&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Used for hierarchical data structures like tree-based searches (e.g., routing networks).&lt;/td&gt;
      &lt;td &gt;🟢 Supports non-balanced tree structures like quadtrees &amp; k-d trees, good for hierarchical data.&lt;/td&gt;
      &lt;td &gt;🔴 N/A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td &gt;&lt;strong&gt;GiST&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Ideal for geometric and full-text search queries, e.g., finding nearby locations.&lt;/td&gt;
      &lt;td &gt;🟢 Infrastructure for specialized indexes; used for geometric &amp; full-text search.&lt;/td&gt;
      &lt;td &gt;🔴 N/A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr &gt;
      &lt;td &gt;&lt;strong&gt;Columnstore&lt;/strong&gt;&lt;/td&gt;
      &lt;td &gt;Best for OLAP workloads and analytical queries (e.g., data warehousing).&lt;/td&gt;
      &lt;td &gt;🔴 While PostgreSQL has several extensions that offer columnar storage, like Citus and Timescale, these are relatively recent implementations and may be limited by use case.&lt;/td&gt;
      &lt;td &gt;🟢 SQL Server has built-in columnar storage implemented as an index type since SQL Server 2012.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
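&lt;p&gt;To make the PostgreSQL column of the table more concrete, here is a short sketch of how several of these index types are created. The &lt;code &gt;orders&lt;/code&gt; table and all index names are hypothetical and exist only for illustration. The full-text example uses an expression index over &lt;code &gt;to_tsvector&lt;/code&gt; instead of a dedicated &lt;code &gt;tsvector&lt;/code&gt; column, which is one of several possible approaches:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table, for illustration only
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id bigint NOT NULL,
    status      text NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    notes       text
);

-- B-Tree: the default index type when no USING clause is given
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- One-time physical reordering by an index; not maintained for new rows
CLUSTER orders USING orders_customer_id_idx;

-- Hash: single column, equality lookups only
CREATE INDEX orders_status_hash_idx ON orders USING hash (status);

-- Partial: index only the subset of rows matching the WHERE clause
CREATE INDEX orders_pending_idx ON orders (created_at)
    WHERE status = &apos;pending&apos;;

-- BRIN: block range summaries, suited to naturally ordered data
CREATE INDEX orders_created_at_brin_idx ON orders USING brin (created_at);

-- GIN: full-text search through an expression index
CREATE INDEX orders_notes_fts_idx ON orders
    USING gin (to_tsvector(&apos;english&apos;, coalesce(notes, &apos;&apos;)));&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;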
&lt;h2 id=&quot;choosing-the-right-index-for-your-workload&quot; &gt;&lt;a href=&quot;#choosing-the-right-index-for-your-workload&quot; aria-label=&quot;choosing the right index for your workload permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Choosing the right index for your workload&lt;/h2&gt;
&lt;p&gt;Understanding the differences between PostgreSQL and SQL Server indexing is crucial when optimizing query performance, planning a migration, or designing a high-performance database. Choosing the right indexing strategy requires deep knowledge of query execution patterns and performance trade-offs. Many teams manually experiment with different indexing strategies, which can lead to over-indexing, redundant indexes, or missed optimization opportunities.&lt;/p&gt;
&lt;p&gt;Instead of trial and error, &lt;a href=&quot;https://pganalyze.com/blog/index-advisor-v3&quot;&gt;&lt;strong&gt;pganalyze Index Advisor&lt;/strong&gt;&lt;/a&gt; automatically detects missing indexes, redundant indexes, and optimal column order for multicolumn indexes by applying a constraint programming model against real query execution data. This removes the guesswork and ensures that PostgreSQL databases are indexed for maximum performance.&lt;/p&gt;
&lt;h2 id=&quot;references&quot; &gt;&lt;a href=&quot;#references&quot; aria-label=&quot;references permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;References:&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/indexes-types.html#INDEXES-TYPES-BTREE&quot;&gt;PostgreSQL Documentation: 17: 11.2. Index Types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/16/btree-implementation.html#BTREE-DEDUPLICATION&quot;&gt;PostgreSQL Documentation: 16: 67.4. Deduplication&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://learn.microsoft.com/en-us/sql/relational-databases/indexes/indexes?view=sql-server-ver16&quot;&gt;Microsoft SQL Server Documentation: Indexes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://learn.microsoft.com/en-us/sql/relational-databases/sql-server-index-design-guide?view=sql-server-ver16&quot;&gt;Microsoft Blog: SQL Server and Azure SQL index architecture and design guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/index-advisor-v3&quot;&gt;pganalyze Blog: Introducing pganalyze Index Advisor 3.0&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Replacing Oracle Hints: Best Practices with pg_hint_plan on PostgreSQL]]></title><description><![CDATA[If you're migrating from Oracle Database to PostgreSQL, you're likely accustomed to using hints to optimize queries. In Oracle, these are special directives embedded in SQL (like ) that steer the optimizer's execution plan. They can be extremely useful but also introduce complexity and “hint debt” over time. PostgreSQL takes a very different approach to query optimization. Rather than supporting built-in hints, the Postgres community, historically, has emphasized relying on its cost-based…]]></description><link>https://pganalyze.com/blog/migrating-from-oracle-hints-to-pg-hint-plan-on-postgresql</link><guid isPermaLink="false">https://pganalyze.com/blog/migrating-from-oracle-hints-to-pg-hint-plan-on-postgresql</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Wed, 05 Feb 2025 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;If you&apos;re migrating from Oracle Database to PostgreSQL, you&apos;re likely accustomed to using &lt;strong&gt;hints&lt;/strong&gt; to optimize queries. In Oracle, these are special directives embedded in SQL (like &lt;code &gt;/*+ INDEX(...) */&lt;/code&gt;) that steer the optimizer&apos;s execution plan. They can be extremely useful but also introduce complexity and “hint debt” over time.&lt;/p&gt;
&lt;p&gt;PostgreSQL takes a very different approach to query optimization. Rather than supporting built-in hints, the Postgres community has historically emphasized relying on its cost-based planner to choose execution plans based on statistics, indexes, and configuration parameters. In practice, that works most of the time, but there can be cases where the planner is stubborn and keeps picking a bad plan. In migration situations, this is particularly complicated, because performance may depend on a particular execution plan that was previously specified using an Oracle hint.&lt;/p&gt;
&lt;p&gt;So you might ask yourself: &lt;strong&gt;how do you replicate or replace Oracle hints when you migrate to Postgres?&lt;/strong&gt; That&apos;s where the &lt;a href=&quot;https://github.com/ossc-db/pg_hint_plan&quot;&gt;&lt;strong&gt;pg_hint_plan&lt;/strong&gt;&lt;/a&gt; extension comes in.&lt;/p&gt;
&lt;p&gt;In this post, we&apos;ll explore the differences between Oracle&apos;s hint system and PostgreSQL&apos;s planner with pg_hint_plan, discuss when you still need hints in your Postgres queries, and walk through best practices for using pg_hint_plan effectively, including &lt;a href=&quot;#using-pganalyze-to-test-query-hints&quot;&gt;how pganalyze can help&lt;/a&gt;.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#when-and-when-not-to-use-hints&quot;&gt;When (and when not) to use hints&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#relying-on-postgresqls-cost-based-planner&quot;&gt;Relying on PostgreSQL&apos;s cost-based planner&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#root-causes-of-postgres-planner-problems&quot;&gt;Root causes of Postgres planner problems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#when-hints-can-help&quot;&gt;When hints can help&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#mapping-oracle-hints-to-pg_hint_plan&quot;&gt;Mapping Oracle hints to pg_hint_plan&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#access-path-or-index-hints&quot;&gt;Access path (or index) hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#join-operation-hints&quot;&gt;Join operation hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#join-order-hints&quot;&gt;Join order hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#parallel--degree-of-parallelism-hints&quot;&gt;Parallel / degree of parallelism hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#query-transformation--subquery-hints&quot;&gt;Query transformation &amp;#x26; subquery hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#result-cache-and-other-specialized-hints&quot;&gt;Result cache and other specialized hints&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#additional-pg_hint_plan-features-no-oracle-equivalent&quot;&gt;Additional pg_hint_plan Features (no Oracle equivalent)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#best-practices-for-debugging-pg_hint_plan-hints&quot;&gt;Best practices for debugging pg_hint_plan hints&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#using-pganalyze-to-test-query-hints&quot;&gt;Using pganalyze to test query hints&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#references&quot;&gt;References&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#documentation&quot;&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#5mins-of-postgres-episodes-on-planner-quirks&quot;&gt;5mins of Postgres episodes on planner quirks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#webinars--ebooks&quot;&gt;Webinars &amp;#x26; eBooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#blog-posts&quot;&gt;Blog posts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;when-and-when-not-to-use-hints&quot; &gt;&lt;a href=&quot;#when-and-when-not-to-use-hints&quot; aria-label=&quot;when and when not to use hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;When (and when not) to use hints&lt;/h2&gt;
&lt;p&gt;It might be tempting to migrate all Oracle hints into pg_hint_plan, but this can be overkill and sometimes even counterproductive in PostgreSQL. Let&apos;s talk about where hints fit into a well-tuned Postgres environment.&lt;/p&gt;
&lt;h3 id=&quot;relying-on-postgresqls-cost-based-planner&quot; &gt;&lt;a href=&quot;#relying-on-postgresqls-cost-based-planner&quot; aria-label=&quot;relying on postgresqls cost based planner permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Relying on PostgreSQL&apos;s cost-based planner&lt;/h3&gt;
&lt;p&gt;PostgreSQL is built around a cost-based planner that typically selects efficient execution paths without manual intervention. It uses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Statistics&lt;/strong&gt; on table sizes, column data distribution, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Planner cost settings&lt;/strong&gt; like &lt;code &gt;random_page_cost&lt;/code&gt; and &lt;code &gt;cpu_tuple_cost&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Server configuration parameters&lt;/strong&gt; such as &lt;code &gt;enable_seqscan&lt;/code&gt;, &lt;code &gt;work_mem&lt;/code&gt;, and &lt;code &gt;effective_cache_size&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The philosophy behind PostgreSQL&apos;s planner is that if your statistics, indexes, and cost parameters are well-tuned, the engine can usually figure out the best plan on its own, and there is rarely a need to rely on hints.&lt;/p&gt;
&lt;p&gt;However, this system isn&apos;t perfect, and Postgres sometimes picks sub-optimal plans, as we&apos;ve talked about in our Postgres &lt;a href=&quot;https://pganalyze.com/blog/migrating-from-oracle-hints-to-pg-hint-plan-on-postgresql#5mins-of-postgres-episodes-on-planner-quirks&quot;&gt;planner quirks series&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;root-causes-of-postgres-planner-problems&quot; &gt;&lt;a href=&quot;#root-causes-of-postgres-planner-problems&quot; aria-label=&quot;root causes of postgres planner problems permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Root causes of Postgres planner problems&lt;/h3&gt;
&lt;p&gt;A common cause of poor Postgres query plans is out-of-date or incorrect statistics. Statistics about table columns and the selectivity of query filters &lt;a href=&quot;https://pganalyze.com/webinars/how-to-optimize-slow-queries-with-EXPLAIN&quot;&gt;are critical for the planner&lt;/a&gt; to make good decisions. Frequent &lt;code &gt;ANALYZE&lt;/code&gt; operations, combined with tuned statistics target settings and extended statistics created with &lt;code &gt;CREATE STATISTICS&lt;/code&gt;, ensure that the system captures current information about data distributions.&lt;/p&gt;
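&lt;p&gt;As a rough sketch of what this can look like in practice, the statements below raise the statistics target for one column, add extended statistics for two correlated columns, and then refresh the statistics. The table and column names (&lt;code &gt;orders&lt;/code&gt;, &lt;code &gt;status&lt;/code&gt;, &lt;code &gt;city&lt;/code&gt;, &lt;code &gt;zip_code&lt;/code&gt;) and the target value are made up for illustration:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table and columns, illustrative values only

-- Collect a larger sample for a column with a skewed distribution
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;

-- Tell the planner that two columns are correlated
CREATE STATISTICS orders_city_zip_stats (dependencies, ndistinct)
  ON city, zip_code FROM orders;

-- Refresh statistics so the planner sees the changes
ANALYZE orders;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Neither setting has any effect until the next &lt;code &gt;ANALYZE&lt;/code&gt; of the table, whether run manually or by autovacuum.&lt;/p&gt;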
&lt;p&gt;A thoughtfully designed schema with &lt;a href=&quot;https://pganalyze.com/blog/index-advisor-v3&quot;&gt;well-chosen indexes&lt;/a&gt; and, when appropriate, table partitioning, often provides a bigger performance boost than manual hints, which can only do so much on a large table.&lt;/p&gt;
&lt;p&gt;Settings such as &lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-work-mem-tuning&quot;&gt;&lt;code &gt;work_mem&lt;/code&gt;&lt;/a&gt;, &lt;code &gt;random_page_cost&lt;/code&gt;, and &lt;code &gt;effective_cache_size&lt;/code&gt; have a significant impact on the decisions the planner makes, yet they are often set at the default value, which can cause bad query plans. &lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;Optimizing these settings&lt;/a&gt; can resolve many query performance challenges without introducing hints. When the planner&apos;s cost model aligns well with the realities of your hardware and data, it typically arrives at better plans.&lt;/p&gt;
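&lt;p&gt;One low-risk way to experiment is to change these settings for a single session and compare the resulting plans before touching the server configuration. The values below are placeholders, not recommendations; suitable values depend on your storage, memory, and workload:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Illustrative values only; these apply to the current session
SET random_page_cost = 1.1;
SET effective_cache_size = &apos;24GB&apos;;
SET work_mem = &apos;64MB&apos;;

-- Re-check the plan of the problematic query with the new settings
EXPLAIN SELECT ...;

-- Return to the server defaults when done
RESET random_page_cost;
RESET effective_cache_size;
RESET work_mem;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;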
&lt;h3 id=&quot;when-hints-can-help&quot; &gt;&lt;a href=&quot;#when-hints-can-help&quot; aria-label=&quot;when hints can help permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;When hints can help&lt;/h3&gt;
&lt;p&gt;Despite the strengths of PostgreSQL&apos;s planner, there are times when hints prove beneficial. In fact, forcing a certain plan for debugging can offer valuable insight into why the planner&apos;s default choice might be less than ideal, and which part of the query plan had inaccurate costs, often caused by statistics issues.&lt;/p&gt;
&lt;p&gt;Legacy Oracle queries often rely heavily on hints, and adjusting them or restructuring the schema might be too risky or time-intensive. In such cases, &lt;a href=&quot;https://github.com/ossc-db/pg_hint_plan/tree/master&quot;&gt;pg_hint_plan&lt;/a&gt; can replicate specific behaviors from Oracle without a total rewrite. Hints also help in highly complex queries or unusual data distributions that consistently lead the planner astray. They are likewise useful as a temporary patch while deeper issues, such as missing statistics or incorrectly set parameters, are being addressed.&lt;/p&gt;
&lt;p&gt;When statistical accuracy, schema design, and parameter tuning are all properly addressed in Postgres, hints become an added layer of complexity rather than a necessity. Use them sparingly, focusing on special cases that truly require hard-coded logic.&lt;/p&gt;
&lt;h2 id=&quot;mapping-oracle-hints-to-pg_hint_plan&quot; &gt;&lt;a href=&quot;#mapping-oracle-hints-to-pg_hint_plan&quot; aria-label=&quot;mapping oracle hints to pg_hint_plan permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Mapping Oracle hints to pg_hint_plan&lt;/h2&gt;
&lt;p&gt;Both Oracle hints and pg_hint_plan hints are embedded in SQL statements using &lt;code &gt;/*+ ... */&lt;/code&gt;. They can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Force the use of specific indexes or join methods (e.g., nested loops)&lt;/li&gt;
&lt;li&gt;Enable or disable parallel execution&lt;/li&gt;
&lt;li&gt;Override other plan choices&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These hints can be very direct: &lt;em&gt;“Use index X on this table,”&lt;/em&gt; or &lt;em&gt;“Join table A and B using a Nested Loop Join.”&lt;/em&gt; This level of control is sometimes essential when the database optimizer doesn&apos;t pick an optimal plan on its own or when you need consistent performance across different instances.&lt;/p&gt;
&lt;p&gt;When you do decide to replicate Oracle hints in Postgres, you&apos;ll likely look for direct equivalents. pg_hint_plan supports many—but not all—Oracle-like hints. pg_hint_plan primarily controls scan methods, join methods, join order, and query parallelism. Many of Oracle&apos;s advanced hints for rewriting queries, star transformations, dynamic sampling, and specialized caching are simply not available or applicable in Postgres.&lt;/p&gt;
&lt;p&gt;Instead, in Postgres, you often achieve similar behavior by tuning planner GUCs (like &lt;code &gt;enable_hashjoin&lt;/code&gt;, &lt;code &gt;enable_nestloop&lt;/code&gt;), rewriting queries, materializing parts of the query with the &lt;code &gt;MATERIALIZED&lt;/code&gt; keyword for CTEs, or using indexes/constraints that nudge the Postgres planner.&lt;/p&gt;
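&lt;p&gt;For example, a planner GUC can be limited to a single transaction with &lt;code &gt;SET LOCAL&lt;/code&gt;, so that it does not leak into other queries on the same connection. This is a minimal sketch; the query itself is elided:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;BEGIN;
-- Applies only until the end of this transaction
SET LOCAL enable_nestloop = off;
SELECT ...;
COMMIT;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Keep in mind that the &lt;code &gt;enable_*&lt;/code&gt; settings discourage a plan type rather than strictly forbid it: if no other plan is possible, Postgres will still use it.&lt;/p&gt;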
&lt;p&gt;Let&apos;s review some common situations and map them from Oracle Database hints to pg_hint_plan syntax or other Postgres alternatives.&lt;/p&gt;
&lt;h3 id=&quot;access-path-or-index-hints&quot; &gt;&lt;a href=&quot;#access-path-or-index-hints&quot; aria-label=&quot;access path or index hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Access path (or index) hints&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;33%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th width=&quot;34%&quot;&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;FULL(table)&lt;/code&gt;&lt;br&gt;Force a full table scan&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;SeqScan(table)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Forces Postgres to use a sequential scan (called Full Table Scan on Oracle) on the named table.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;INDEX(table [index])&lt;/code&gt;&lt;br&gt;Force index scan&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;IndexScan(table [index])&lt;/code&gt; &lt;i&gt;or&lt;/i&gt; &lt;code&gt;IndexOnlyScan(table [index])&lt;/code&gt; &lt;i&gt;or&lt;/i&gt; &lt;code&gt;BitmapScan(table [index])&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;pg_hint_plan has separate hints for regular index scans, index-only scans, or bitmap index scans.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;INDEX_FFS(table index)&lt;/code&gt;&lt;br&gt;Fast full index scan&lt;/td&gt;
      &lt;td&gt;No direct equivalent. &lt;code&gt;IndexOnlyScan&lt;/code&gt; is approximate.&lt;/td&gt;
      &lt;td&gt;Postgres can answer a query from the index by using an IndexOnlyScan, if all filtered and returned columns are indexed. However, Postgres sometimes still checks the table to verify visibility of deleted rows (this cannot be turned off).&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;INDEX_DESC(table [index])&lt;/code&gt;&lt;br&gt;Reverse index scan&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;IndexScan&lt;/code&gt; with an &lt;code&gt;ORDER BY ... DESC&lt;/code&gt; in the query itself.&lt;/td&gt;
      &lt;td&gt;pg_hint_plan can&apos;t directly enforce a &lt;i&gt;descending&lt;/i&gt; index scan; you typically rely on query order or an index with the right sort order.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;NO_INDEX(table [index])&lt;/code&gt;&lt;br&gt;Disallow index&lt;/td&gt;
      &lt;td&gt;No equivalent.&lt;/td&gt;
      &lt;td&gt;No equivalent to disallow individual indexes.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;INDEX_JOIN(table)&lt;/code&gt;&lt;br&gt;Use index join&lt;/td&gt;
      &lt;td&gt;No equivalent.&lt;/td&gt;
      &lt;td&gt;PostgreSQL does not have a direct &quot;index join&quot; concept like Oracle.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In Oracle, you might have:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;/*+ INDEX(table1 idx_table1_col) */&lt;/span&gt; 
       col1&lt;span &gt;,&lt;/span&gt; col2
&lt;span &gt;FROM&lt;/span&gt;   table1
&lt;span &gt;WHERE&lt;/span&gt;  col1 &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;something&apos;&lt;/span&gt;
&lt;span &gt;ORDER&lt;/span&gt; &lt;span &gt;BY&lt;/span&gt; col2 &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In PostgreSQL with pg_hint_plan, you&apos;d translate it to:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*+
  IndexScan(table1 idx_table1_col)
*/&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; col1&lt;span &gt;,&lt;/span&gt; col2
&lt;span &gt;FROM&lt;/span&gt;   table1
&lt;span &gt;WHERE&lt;/span&gt;  col1 &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;something&apos;&lt;/span&gt;
&lt;span &gt;ORDER&lt;/span&gt; &lt;span &gt;BY&lt;/span&gt; col2 &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;join-operation-hints&quot; &gt;&lt;a href=&quot;#join-operation-hints&quot; aria-label=&quot;join operation hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Join operation hints&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;33%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th width=&quot;34%&quot;&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;USE_NL(table1 table2)&lt;/code&gt;&lt;br&gt;Use nested loops&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;NestLoop(table1 table2)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Forces a Nested Loop Join between the two named tables.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;USE_HASH(table1 table2)&lt;/code&gt;&lt;br&gt;Use hash join&lt;br&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;HashJoin(table1 table2)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Forces a Hash Join between the two named tables.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;USE_MERGE(table1 table2)&lt;/code&gt;&lt;br&gt;Use sort-merge join&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;MergeJoin(table1 table2)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Forces a Merge Join between the two named tables.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;USE_NL_WITH_INDEX(t1 idx1)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;NestLoop(table1 table2)&lt;/code&gt; + &lt;code&gt;IndexScan(table1 index1)&lt;/code&gt; + &lt;code&gt;Leading((table2 table1))&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;To get what Postgres calls a &lt;a href=&quot;https://pganalyze.com/blog/how-postgres-chooses-index#parameterized-index-scans-or-why-nested-loop-are-sometimes-a-good-join-type&quot;&gt;Parameterized Index Scan&lt;/a&gt;, the hints must force three things: the Nested Loop Join, the join order (via Leading), and the use of the correct index. Note that the Leading hint requires extra parentheses to force the ordering: the first table listed is the outer table, followed by the inner table (which is the one the index scan is on).&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;
        &lt;code&gt;NO_USE_NL(t1 [t2...])&lt;/code&gt;&lt;br&gt;&lt;br&gt;
        &lt;code&gt;NO_USE_MERGE(t1 [t2...])&lt;/code&gt;&lt;br&gt;&lt;br&gt;
        &lt;code&gt;NO_USE_HASH(t1 [t2...])&lt;/code&gt;
      &lt;/td&gt;
      &lt;td&gt;
        &lt;code&gt;NoNestLoop(t1 t2 [t3...])&lt;/code&gt;&lt;br&gt;&lt;br&gt;
        &lt;code&gt;NoMergeJoin(t1 t2 [t3...])&lt;/code&gt;&lt;br&gt;&lt;br&gt;
        &lt;code&gt;NoHashJoin(t1 t2 [t3...])&lt;/code&gt;
      &lt;/td&gt;
      &lt;td&gt;
        The pg_hint_plan hints instruct PostgreSQL&apos;s query planner not to use a Nested Loop/Merge/Hash join for the listed tables (which need to include both the inner and the outer table), while the Oracle hints tell the optimizer not to use a Nested Loop/Merge/Hash join for each specified table where it is the inner table of the join.
      &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
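&lt;p&gt;To make the &lt;code &gt;USE_NL_WITH_INDEX&lt;/code&gt; row more concrete, here is a sketch of how the three hints could be combined in one hint comment. The schema is hypothetical: we assume &lt;code &gt;index1&lt;/code&gt; is an index on &lt;code &gt;table1.table2_id&lt;/code&gt;, and that &lt;code &gt;table2&lt;/code&gt; has a column &lt;code &gt;col1&lt;/code&gt; to filter on:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;/*+
  Leading((table2 table1))
  NestLoop(table1 table2)
  IndexScan(table1 index1)
*/
SELECT *
FROM   table1
JOIN   table2 ON table1.table2_id = table2.id
WHERE  table2.col1 = &apos;something&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With these hints, &lt;code &gt;table2&lt;/code&gt; is the outer side of the Nested Loop, and each of its rows drives a lookup on &lt;code &gt;table1&lt;/code&gt; through &lt;code &gt;index1&lt;/code&gt;. It is worth confirming with &lt;code &gt;EXPLAIN&lt;/code&gt; that the plan has the expected shape.&lt;/p&gt;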
&lt;h3 id=&quot;join-order-hints&quot; &gt;&lt;a href=&quot;#join-order-hints&quot; aria-label=&quot;join order hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Join order hints&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;33%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th width=&quot;34%&quot;&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;ORDERED&lt;/code&gt;&lt;br&gt;Join in the order of tables in the FROM clause&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;Set(join_collapse_limit 1)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;In Postgres, setting the &lt;code&gt;join_collapse_limit&lt;/code&gt; setting to &quot;1&quot; will force Postgres to join the tables in the order they are listed in the query. You can set this either via pg_hint_plan or a regular &lt;code&gt;SET&lt;/code&gt; command before running the query. See &lt;a href=&quot;https://www.postgresql.org/docs/current/explicit-joins.html&quot;&gt;examples in the Postgres documentation&lt;/a&gt;.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;LEADING(t1 t2 ... tN)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;Leading(t1 t2 ... tN)&lt;/code&gt;&lt;br&gt;&lt;br&gt;&lt;code&gt;Leading(((t1 t2) t3))&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;pg_hint_plan supports &lt;code&gt;Leading(...)&lt;/code&gt; to fix the join order. You can list multiple tables in the desired join sequence. Use the syntax with additional parentheses around each pair to specify which table is used as the outer table and which as the inner table.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
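&lt;p&gt;As an illustration, both approaches are shown below on a hypothetical three-table join (the tables &lt;code &gt;t1&lt;/code&gt;, &lt;code &gt;t2&lt;/code&gt;, &lt;code &gt;t3&lt;/code&gt; and their join columns are assumptions):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Option 1: follow the order of the explicit JOINs in the query text
/*+ Set(join_collapse_limit 1) */
SELECT *
FROM   t1
JOIN   t2 ON t2.t1_id = t1.id
JOIN   t3 ON t3.t2_id = t2.id;

-- Option 2: spell out the join order and the outer/inner sides,
-- independent of how the query text is written
/*+ Leading(((t1 t2) t3)) */
SELECT *
FROM   t3
JOIN   t2 ON t3.t2_id = t2.id
JOIN   t1 ON t2.t1_id = t1.id;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that &lt;code &gt;join_collapse_limit&lt;/code&gt; constrains explicit &lt;code &gt;JOIN&lt;/code&gt; syntax. Tables listed with commas in the &lt;code &gt;FROM&lt;/code&gt; clause can still be reordered by the planner, so Oracle queries written in that style may need to be rewritten with explicit joins or use &lt;code &gt;Leading&lt;/code&gt; instead.&lt;/p&gt;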
&lt;h3 id=&quot;parallel--degree-of-parallelism-hints&quot; &gt;&lt;a href=&quot;#parallel--degree-of-parallelism-hints&quot; aria-label=&quot;parallel  degree of parallelism hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Parallel / degree of parallelism hints&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;33%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th width=&quot;34%&quot;&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;PARALLEL(table, n)&lt;/code&gt;&lt;br&gt;Parallel degree n&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;Parallel(table n hard)&lt;/code&gt;&lt;/td&gt;
      &lt;td width=&quot;34%&quot;&gt;pg_hint_plan by default (&quot;soft&quot;) only sets the configured maximum number of workers (&lt;code &gt;max_parallel_workers_per_gather&lt;/code&gt;) but won&apos;t force a parallel plan if the costs are not in its favor. You can force a parallel plan by specifying the third argument as &lt;code&gt;hard&lt;/code&gt;, which matches Oracle&apos;s behaviour when specifying a specific parallel degree.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;NO_PARALLEL(table)&lt;/code&gt;&lt;br&gt;Disallow parallel&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;Parallel(table 0)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;pg_hint_plan inhibits parallel execution on the table when the number of workers is set to zero.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Example usage in &lt;strong&gt;pg_hint_plan,&lt;/strong&gt; increasing the parallel workers from the default of 2 (max_parallel_workers_per_gather) to 4 just for this query&apos;s use of the &quot;sales&quot; table:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*+
  Parallel(sales 4)
*/&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&quot;query-transformation--subquery-hints&quot; &gt;&lt;a href=&quot;#query-transformation--subquery-hints&quot; aria-label=&quot;query transformation  subquery hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Query transformation &amp;#x26; subquery hints&lt;/h3&gt;
&lt;p&gt;Oracle has many hints controlling query transformations (like unnesting subqueries, merging views, star transformations, etc.). pg_hint_plan does not provide direct equivalents for these transformations; PostgreSQL&apos;s planner transformations are generally not hint-based but either controlled automatically or by GUC parameters.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;20%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;UNNEST / NO_UNNEST&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;PostgreSQL decides automatically on subquery unnesting (lateral joins, subquery flattening, etc.), and pg_hint_plan cannot influence this. However, queries can be rewritten to use a CTE with the &lt;code&gt;NOT MATERIALIZED&lt;/code&gt; keyword, which will behave similar to Oracle&apos;s &lt;code&gt;UNNEST&lt;/code&gt;, or &lt;code&gt;MATERIALIZED&lt;/code&gt; which will behave like &lt;code&gt;NO_UNNEST&lt;/code&gt;. &lt;a href=&quot;https://www.postgresql.org/docs/current/queries-with.html&quot;&gt;See Postgres documentation&lt;/a&gt;.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;MERGE&lt;/code&gt; / &lt;code&gt;NO_MERGE&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;In Postgres, views are inlined automatically as if they were a subquery; there is no fine-grained hint for controlling this.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;PUSH_SUBQ&lt;/code&gt; / &lt;code&gt;NO_PUSH_SUBQ&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;No direct control over subquery execution in &lt;code&gt;pg_hint_plan&lt;/code&gt;.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;STAR_TRANSFORMATION&lt;/code&gt; / &lt;code&gt;NO_STAR_TRANSFORMATION&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;Oracle&apos;s star transformations for data warehouse schemas have no direct counterpart in Postgres.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;FACT&lt;/code&gt; / &lt;code&gt;NO_FACT&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;Oracle uses these for star schemas; not applicable in Postgres.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
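&lt;p&gt;The CTE rewrite mentioned for &lt;code &gt;UNNEST&lt;/code&gt; / &lt;code &gt;NO_UNNEST&lt;/code&gt; could look roughly like this. The &lt;code &gt;orders&lt;/code&gt; and &lt;code &gt;customers&lt;/code&gt; tables are hypothetical:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- MATERIALIZED: the CTE is planned and computed on its own,
-- separately from the outer query
WITH open_orders AS MATERIALIZED (
  SELECT customer_id, total
  FROM   orders
  WHERE  status = &apos;open&apos;
)
SELECT c.name, o.total
FROM   customers c
JOIN   open_orders o ON o.customer_id = c.id;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Replacing &lt;code &gt;MATERIALIZED&lt;/code&gt; with &lt;code &gt;NOT MATERIALIZED&lt;/code&gt; asks the planner to fold the CTE into the outer query instead. Both keywords are available in Postgres 12 and newer.&lt;/p&gt;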
&lt;h3 id=&quot;result-cache-and-other-specialized-hints&quot; &gt;&lt;a href=&quot;#result-cache-and-other-specialized-hints&quot; aria-label=&quot;result cache and other specialized hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Result cache and other specialized hints&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th width=&quot;33%&quot;&gt;Oracle Hint&lt;/th&gt;
      &lt;th width=&quot;20%&quot;&gt;pg_hint_plan Equivalent&lt;/th&gt;
      &lt;th width=&quot;47%&quot;&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;RESULT_CACHE&lt;/code&gt; / &lt;code&gt;NO_RESULT_CACHE&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;PostgreSQL does not have a built-in query result cache like Oracle.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;OPT_PARAM(...)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;Set(...)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Postgres parameters are typically set at the session level (&quot;SET&quot; command) or via &quot;Set&quot; hints in pg_hint_plan. Note the parameters that can be set differ between Oracle and Postgres.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;DYNAMIC_SAMPLING(...)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;The Postgres statistics system relies on a separate ANALYZE of the table outside of query execution and does not have an equivalent of dynamic sampling.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;QB_NAME&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;pg_hint_plan does not offer an equivalent to Oracle&apos;s query block functionality for hints.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;PUSH_PRED&lt;/code&gt; / &lt;code&gt;NO_PUSH_PRED&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;Postgres handles predicate pushdown automatically based on heuristics for subqueries; no direct hint.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;USE_CONCAT&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;Oracle uses this to force expansion of &lt;code&gt;OR&lt;/code&gt; clauses into &lt;code&gt;UNION ALL&lt;/code&gt; queries. Postgres does not perform this transformation automatically, so a manual rewrite of the query is needed. &lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-UNION-subquery-pull-up-performance&quot;&gt;See our blog post for an example&lt;/a&gt;.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;NO_QUERY_TRANSFORMATION&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;None&lt;/td&gt;
      &lt;td&gt;The transformations Postgres applies during the planning process cannot be turned off or modified via hints.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
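&lt;p&gt;For the &lt;code &gt;USE_CONCAT&lt;/code&gt; case, a manual rewrite might look like the sketch below, using a hypothetical &lt;code &gt;orders&lt;/code&gt; table with separate indexes on &lt;code &gt;customer_id&lt;/code&gt; and &lt;code &gt;reference_code&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Original query with an OR across two different columns
SELECT id, customer_id, reference_code
FROM   orders
WHERE  customer_id = 123 OR reference_code = &apos;ABC&apos;;

-- Manual rewrite: each branch can use its own index
SELECT id, customer_id, reference_code
FROM   orders
WHERE  customer_id = 123
UNION
SELECT id, customer_id, reference_code
FROM   orders
WHERE  reference_code = &apos;ABC&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This sketch uses &lt;code &gt;UNION&lt;/code&gt; rather than &lt;code &gt;UNION ALL&lt;/code&gt; so that rows matching both conditions are returned only once. That is only equivalent to the original query when the selected columns identify a row uniquely, which is why the assumed primary key &lt;code &gt;id&lt;/code&gt; is included here.&lt;/p&gt;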
&lt;h3 id=&quot;additional-pg_hint_plan-features-no-oracle-equivalent&quot; &gt;&lt;a href=&quot;#additional-pg_hint_plan-features-no-oracle-equivalent&quot; aria-label=&quot;additional pg_hint_plan features no oracle equivalent permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Additional pg_hint_plan Features (no Oracle equivalent)&lt;/h3&gt;
&lt;p&gt;pg_hint_plan has additional hints that don&apos;t map to Oracle hints but can be helpful:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code &gt;Rows(table1 table2 [ n ])&lt;/code&gt;: Tells the planner to assume a join between &lt;code &gt;table1&lt;/code&gt; and &lt;code &gt;table2&lt;/code&gt; returns &lt;code &gt;n&lt;/code&gt; rows (replacing or adjusting the statistics-derived estimate), influencing join order and plan choices.&lt;/li&gt;
&lt;li&gt;&lt;code &gt;Memoize(table1 table2)&lt;/code&gt; / &lt;code &gt;NoMemoize(table1 table2)&lt;/code&gt;: Influences whether the Memoize functionality is applied to the given join tables. Memoize can sometimes cause Postgres planner costs to be off, and as such the “NoMemoize” hint can be useful to avoid query plans that might favor a Nested Loop Join.&lt;/li&gt;
&lt;/ol&gt;
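&lt;p&gt;As a sketch of how these look in a query: in pg_hint_plan&apos;s syntax, the row count in the &lt;code &gt;Rows&lt;/code&gt; hint is written with a prefix, where &lt;code &gt;#&lt;/code&gt; replaces the estimate and &lt;code &gt;+&lt;/code&gt;, &lt;code &gt;-&lt;/code&gt; or &lt;code &gt;*&lt;/code&gt; adjust it. The table names and numbers below are placeholders:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Assume the join of table1 and table2 returns exactly 100 rows
/*+ Rows(table1 table2 #100) */
EXPLAIN SELECT * FROM table1 JOIN table2 ON table1.table2_id = table2.id;

-- Multiply the planner&apos;s own estimate for the join by 10
/*+ Rows(table1 table2 *10) */
EXPLAIN SELECT * FROM table1 JOIN table2 ON table1.table2_id = table2.id;

-- Do not use Memoize for the join between these two tables
/*+ NoMemoize(table1 table2) */
EXPLAIN SELECT * FROM table1 JOIN table2 ON table1.table2_id = table2.id;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;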
&lt;h2 id=&quot;best-practices-for-debugging-pg_hint_plan-hints&quot; &gt;&lt;a href=&quot;#best-practices-for-debugging-pg_hint_plan-hints&quot; aria-label=&quot;best practices for debugging pg_hint_plan hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Best practices for debugging pg_hint_plan hints&lt;/h2&gt;
&lt;p&gt;Sometimes a pg_hint_plan hint won&apos;t take effect, and it&apos;s not always clear why: Postgres will always give you a plan, even if the hints were ignored.&lt;/p&gt;
&lt;p&gt;The most common problems can be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Specifying multiple hint comments (if you have multiple hints you must specify them all in one &lt;code &gt;/*+ ... */&lt;/code&gt; comment)&lt;/li&gt;
&lt;li&gt;Using incorrect pg_hint_plan syntax (e.g. &lt;code &gt;NestedLoop&lt;/code&gt; instead of &lt;code &gt;NestLoop&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;The planner not having a viable path to use the hint (e.g. because the requested index can&apos;t be used for a given expression)&lt;/li&gt;
&lt;li&gt;Re-used table names not having unique aliases in a query (you need to assign an alias to each table in such situations)&lt;/li&gt;
&lt;li&gt;Hints for partitioned tables targeting the child partitions (they must target the parent partitioned table, not the children)&lt;/li&gt;
&lt;li&gt;Subqueries that do not have an assigned name (i.e. are not a CTE) can &lt;a href=&quot;https://pg-hint-plan.readthedocs.io/en/latest/hint_details.html#subqueries&quot;&gt;only be hinted in some cases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, by default you may not see any clear indication of a problem, since pg_hint_plan does not show any debug output by default.&lt;/p&gt;
&lt;p&gt;To better understand why hints may not have been used, you can enable the &lt;code &gt;pg_hint_plan.debug_print&lt;/code&gt; setting. This will give you output like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SET&lt;/span&gt; pg_hint_plan&lt;span &gt;.&lt;/span&gt;debug_print &lt;span &gt;=&lt;/span&gt; &lt;span &gt;true&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;  
&lt;span &gt;/*+ NestedLoop(table1 table2) */&lt;/span&gt; &lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; …&lt;span &gt;;&lt;/span&gt;  &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;INFO:  pg_hint_plan: hint syntax error at or near &quot;NestedLoop&quot;.  
DETAIL:  Unrecognized hint keyword &quot;NestedLoop&quot;.  
                                          QUERY PLAN                                        	   
----------------------------------------------------------------------------------------------------  
…  &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Additionally, you can show more detailed output about hint usage by setting the client log level (&lt;code &gt;client_min_messages&lt;/code&gt;) to &lt;code &gt;LOG&lt;/code&gt;, which will tell you which hints were used successfully:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SET&lt;/span&gt; client_min_messages &lt;span &gt;=&lt;/span&gt; LOG&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;/*+ NestLoop(table1 table2) IndexScan(table3) */&lt;/span&gt; &lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; table1 &lt;span &gt;JOIN&lt;/span&gt; table2 
&lt;span &gt;ON&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;table2_id &lt;span &gt;=&lt;/span&gt; table2&lt;span &gt;.&lt;/span&gt;id&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; table1_id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;123&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;LOG:  pg_hint_plan:
used hint:
NestLoop(table1 table2)
not used hint:
IndexScan(table3)
duplication hint:
error hint:
                                        QUERY PLAN                                     	 
----------------------------------------------------------------------------------------------
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
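&lt;p&gt;For a one-off debugging session, both settings can be combined and reset afterwards. This is a minimal sketch that assumes pg_hint_plan is installed on the server but not preloaded (hence the &lt;code &gt;LOAD&lt;/code&gt;), and it re-uses the placeholder table names from the example above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Assumption: pg_hint_plan is installed, but not listed in shared_preload_libraries
-- or session_preload_libraries. LOAD may require elevated privileges.
LOAD &apos;pg_hint_plan&apos;;

SET pg_hint_plan.debug_print = on;  -- report which hints were used
SET client_min_messages = LOG;      -- show LOG-level messages in this session

/*+ NestLoop(table1 table2) */ EXPLAIN SELECT * FROM table1 JOIN table2 ON (table2_id = table2.id);

-- Return to the previous behavior once done
RESET pg_hint_plan.debug_print;
RESET client_min_messages;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;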
&lt;p&gt;You can find additional aspects to consider in the &lt;a href=&quot;https://pg-hint-plan.readthedocs.io/en/latest/hint_details.html&quot;&gt;pg_hint_plan documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;using-pganalyze-to-test-query-hints&quot; &gt;&lt;a href=&quot;#using-pganalyze-to-test-query-hints&quot; aria-label=&quot;using pganalyze to test query hints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using pganalyze to test query hints&lt;/h2&gt;
&lt;p&gt;Oracle-to-Postgres migrations often run into challenges when teams are on a deadline to complete pre-production performance testing, or right after going live. In such situations, pganalyze can help you quickly iterate on different hints and benchmark query plans using &lt;a href=&quot;https://pganalyze.com/blog/introducing-postgres-query-tuning-workbooks&quot;&gt;Query Tuning Workbooks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the following example, we compared a baseline query with a query variant that uses pg_hint_plan to choose a particular index. The results show that the hint improves performance by more than 60%, and the workbook documents for the whole team why the change was made.&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;/span&gt;&lt;img alt=&quot;compare-plan-with-hint&quot; title=&quot;compare-plan-with-hint&quot; src=&quot;https://pganalyze.com/static/f1d371d15d792cbdf5c535c057ff0a36/1d69c/compare-plan-with-hint.png&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; /&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;By iterating through this process of identifying slow queries, testing variants, and implementing optimizations, you avoid guesswork, ensure that each hint actually benefits your application, and prevent adding unnecessary complexity to your database.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Migrating Oracle hints to PostgreSQL can be a tricky process, but pg_hint_plan provides a valuable tool for those times when you really need to guide Postgres&apos; planner. Nonetheless, remember that PostgreSQL is intended to make sound decisions based on strong statistics, strategic indexing, and well-chosen cost parameters, which can all be optimized using pganalyze. Hints should serve as a targeted solution, not the default approach.&lt;/p&gt;
&lt;h2 id=&quot;references&quot; &gt;&lt;a href=&quot;#references&quot; aria-label=&quot;references permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;References&lt;/h2&gt;
&lt;h3 id=&quot;documentation&quot; &gt;&lt;a href=&quot;#documentation&quot; aria-label=&quot;documentation permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Documentation&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/ossc-db/pg_hint_plan&quot;&gt;pg_hint_plan GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pg-hint-plan.readthedocs.io/en/latest/index.html&quot;&gt;pg_hint_plan Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/Comments.html#GUID-D316D545-89E2-4D54-977F-FC97815CD62E&quot;&gt;Oracle Database - Hint documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/docs/query-tuning&quot;&gt;pganalyze Query Tuning Workbooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/explicit-joins.html&quot;&gt;PostgreSQL Documentation: 14.3. Controlling the Planner with Explicit JOIN Clauses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/queries-with.html&quot;&gt;PostgreSQL Documentation: 7.8. WITH Queries (Common Table Expressions)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;5mins-of-postgres-episodes-on-planner-quirks&quot; &gt;&lt;a href=&quot;#5mins-of-postgres-episodes-on-planner-quirks&quot; aria-label=&quot;5mins of postgres episodes on planner quirks permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;5mins of Postgres episodes on planner quirks&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-planner-join-equivalence-class-in-any-filters&quot;&gt;JOIN Equivalence Classes and IN/ANY filters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-planner-jsonb-selectivity&quot;&gt;How to fix bad JSONB selectivity estimates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-planner-order-by-limit&quot;&gt;The impact of ORDER BY + LIMIT on index usage&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;webinars--ebooks&quot; &gt;&lt;a href=&quot;#webinars--ebooks&quot; aria-label=&quot;webinars  ebooks permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Webinars &amp;#x26; eBooks&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/webinars/how-to-optimize-slow-queries-with-EXPLAIN&quot;&gt;How to Optimize Slow Queries with EXPLAIN to Fix Bad Query Plans&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;Best Practices for Optimizing Postgres Query Performance&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;blog-posts&quot; &gt;&lt;a href=&quot;#blog-posts&quot; aria-label=&quot;blog posts permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Blog posts&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-work-mem-tuning&quot;&gt;The surprising logic of the Postgres work_mem setting, and how to tune it&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/index-advisor-v3&quot;&gt;Introducing pganalyze Index Advisor 3.0 - A workload-aware system for finding missing indexes in Postgres&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/how-postgres-chooses-index&quot;&gt;How Postgres Chooses Which Index To Use For A Query&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-UNION-subquery-pull-up-performance&quot;&gt;Speed up Postgres queries with UNIONs and subquery pull-up&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pganalyze.com/blog/introducing-postgres-query-tuning-workbooks&quot;&gt;Introducing Query Tuning Workbooks: Safely Tune Postgres Queries on Production with pganalyze&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Introducing pg_query for Postgres 16 - Parsing SQL/JSON, Windows support, PL/pgSQL parse mode & more]]></title><description><![CDATA[Parsing SQL queries and turning them into a syntax tree is not a simple task. Especially when you want to support special syntax that is specific to a particular database engine, like Postgres. And when you’re working with queries day in day out, like we do at pganalyze, understanding the actual intent of a query, which tables it scans, which columns it filters on, and such, is essential. Almost 10 years ago, we determined that in order to create the best product for monitoring and optimizing…]]></description><link>https://pganalyze.com/blog/pg-query-postgres-16</link><guid isPermaLink="false">https://pganalyze.com/blog/pg-query-postgres-16</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 11 Jan 2024 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Parsing SQL queries and turning them into a syntax tree is not a simple task. Especially when you want to support special syntax that is specific to a particular database engine, like Postgres. And when you’re working with queries day in day out, like we do at pganalyze, &lt;strong&gt;understanding the actual intent of a query, which tables it scans, which columns it filters on, and such, is essential.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Almost 10 years ago, we determined that in order to create the best product for monitoring and optimizing Postgres, we needed to parse queries the way that Postgres does. &lt;strong&gt;We released the first version of pg_query back in 2014&lt;/strong&gt;, and have seen many different projects outside of pganalyze utilize our open-source project. For example, to support migration use cases, create linting tools, or check which queries an application executes (see our &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser&quot;&gt;post from 2021&lt;/a&gt; for some examples). And to name just one vanity metric, the &lt;a href=&quot;https://rubygems.org/gems/pg_query&quot;&gt;Ruby binding for pg_query&lt;/a&gt; has been downloaded an incredible 34 million times!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Today, we’re excited to announce the new &lt;a href=&quot;https://github.com/pganalyze/libpg_query/releases/tag/16-5.1.0&quot;&gt;pg_query release&lt;/a&gt;&lt;/strong&gt; based on the Postgres 16 parser, which introduces support for running on Windows (a frequently requested addition), alternate query parse modes (e.g. to parse PL/pgSQL assignments), as well as parsing and deparsing new Postgres syntax, such as SQL/JSON. We’ve released updated &lt;a href=&quot;https://github.com/pganalyze/pg_query&quot;&gt;Ruby&lt;/a&gt;, &lt;a href=&quot;https://github.com/pganalyze/pg_query.rs&quot;&gt;Rust&lt;/a&gt; and &lt;a href=&quot;https://github.com/pganalyze/pg_query_go&quot;&gt;Go&lt;/a&gt; bindings, and expect bindings maintained by the community, such as for Node.js and Python, to be updated soon as well.&lt;/p&gt;
&lt;p&gt;In this post, we showcase how to use pg_query in your application, and a few benefits of the new release. But first, let’s go back to the basics - how does pg_query work?&lt;/p&gt;
&lt;h2 id=&quot;pg_query-the-postgres-parser-as-a-standalone-c-library&quot; &gt;&lt;a href=&quot;#pg_query-the-postgres-parser-as-a-standalone-c-library&quot; aria-label=&quot;pg_query the postgres parser as a standalone c library permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;pg_query, the Postgres parser as a standalone C library&lt;/h2&gt;
&lt;p&gt;At its core, pg_query is all about making the “raw_parser” function from Postgres available. We’ve &lt;a href=&quot;https://pganalyze.com/blog/parse-postgresql-queries-in-ruby&quot;&gt;written about this in more detail in the original pg_query announcement&lt;/a&gt;, but the quick summary is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We apply a small number of patches on top of Postgres, e.g. to help with parsing $n parameter references in queries from pg_stat_statements&lt;/li&gt;
&lt;li&gt;We utilize libclang to build a tree of dependencies between functions and global variables in the Postgres source code&lt;/li&gt;
&lt;li&gt;In some cases, we apply mocks to avoid entering parts of Postgres we don’t need (e.g., functions that access the file system)&lt;/li&gt;
&lt;li&gt;We locate all the source code necessary for the functions we want to call (like “raw_parser”), and remove all other code, to make sure the compiler doesn’t do unnecessary work, or pull in functionality we don’t need&lt;/li&gt;
&lt;li&gt;From the built-in node definitions (which are C structs), we automatically create output functions for JSON and protocol buffers, to make it convenient to write bindings in other programming languages&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Overall, this results in a library that can parse SQL text and return a Postgres parse tree for you to work with and modify, whilst supporting the full syntax that Postgres itself supports.&lt;/p&gt;
&lt;p&gt;From an end user perspective that means you can, for example in the Ruby library, use the following code to parse a query, and find out which table it&apos;s querying:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;require&lt;/span&gt; &lt;span &gt;&apos;pg_query&apos;&lt;/span&gt;
parsed_query &lt;span &gt;=&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM users&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
puts parsed_query&lt;span &gt;.&lt;/span&gt;tree&lt;span &gt;.&lt;/span&gt;stmts&lt;span &gt;.&lt;/span&gt;first&lt;span &gt;.&lt;/span&gt;stmt&lt;span &gt;.&lt;/span&gt;select_stmt&lt;span &gt;.&lt;/span&gt;from_clause&lt;span &gt;.&lt;/span&gt;first&lt;span &gt;.&lt;/span&gt;range_var&lt;span &gt;.&lt;/span&gt;inspect
&lt;span &gt;# =&gt; &amp;lt;PgQuery::RangeVar: catalogname: &quot;&quot;, schemaname: &quot;&quot;, relname: &quot;users&quot;, inh: true, relpersistence: &quot;p&quot;, location: 14&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The parse tree structs are automatically generated as &lt;a href=&quot;https://github.com/pganalyze/libpg_query/blob/16-latest/protobuf/pg_query.proto&quot;&gt;protocol buffer definitions&lt;/a&gt; based on Postgres’ internal structs located in &lt;a href=&quot;https://github.com/postgres/postgres/blob/REL_16_STABLE/src/include/nodes/parsenodes.h&quot;&gt;parsenodes.h&lt;/a&gt; and adjacent files, and the language-specific bindings can use each language’s protobuf libraries to have properly typed structs as well.&lt;/p&gt;
&lt;p&gt;The main change in the core parsing functionality in this release is that we’ve added support for compiling libpg_query on Windows (with either MSVC, or an MSYS2 stack using MinGW/etc), a &lt;a href=&quot;https://github.com/pganalyze/libpg_query/issues/44&quot;&gt;frequently requested feature&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;using-query-fingerprints-to-identify-queries-across-servers&quot; &gt;&lt;a href=&quot;#using-query-fingerprints-to-identify-queries-across-servers&quot; aria-label=&quot;using query fingerprints to identify queries across servers permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using query fingerprints to identify queries across servers&lt;/h2&gt;
&lt;p&gt;Besides parsing itself, there was another major use case that we needed to solve for pganalyze: &lt;strong&gt;The ability to group queries together.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Postgres itself generates a “queryid” to support this. Originally part of pg_stat_statements, it has been part of Postgres core since Postgres 14, and is generated when “compute_query_id” is enabled (automatically done when using pg_stat_statements). However, the Postgres queryid has its flaws: Besides not always grouping together as well as it could (e.g. in the case of IN lists), it’s not portable. If you ran the same query on two different servers, you would get two different query IDs. This difference in query IDs is primarily explained by the fact that Postgres determines which tables a query references based on the relation OIDs. But those OIDs are not stable across servers, as they are internal identifiers.&lt;/p&gt;
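&lt;p&gt;To see the queryid that Postgres assigns on a given server, something like the following can be used. This is a sketch that assumes Postgres 14 or newer, the pg_stat_statements extension being installed, and a &lt;code &gt;users&lt;/code&gt; table like the one used in the examples in this post:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Check whether query IDs are computed (auto, on, off or regress)
SHOW compute_query_id;

-- With query IDs enabled, EXPLAIN (VERBOSE) includes a &quot;Query Identifier&quot; line
EXPLAIN (VERBOSE) SELECT * FROM users WHERE id = 1;

-- pg_stat_statements exposes the same identifier in its queryid column
SELECT queryid, calls, query
  FROM pg_stat_statements
 WHERE query LIKE &apos;%FROM users%&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Running this on two different servers would typically show different queryid values for the same statement, for the OID-related reason described above.&lt;/p&gt;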
&lt;p&gt;With the &lt;strong&gt;pg_query fingerprint&lt;/strong&gt; we intentionally went another way: We utilize the name (and schema) of the table, as it is present in the raw parse tree that pg_query has access to, when generating a unique identifier for a query.&lt;/p&gt;
&lt;p&gt;There are of course many other parts of a query we also take into consideration, e.g. referenced columns, expressions, functions, etc. To enable grouping we do not include constant values in the fingerprint, to ensure that two similar queries get the same fingerprint:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;fingerprint&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM users WHERE id = 1&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;# =&gt; &quot;a0ead580058af585&quot;&lt;/span&gt;
&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;fingerprint&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM users WHERE id = 2&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;# =&gt; &quot;a0ead580058af585&quot;&lt;/span&gt;
&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;fingerprint&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM users WHERE email = $1&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;# =&gt; &quot;e213d9d32c7097d5&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;What else can we use fingerprints for? One use case that we’ve heard about from pganalyze customers is to use query fingerprints to identify the same query on both the application side and the database.&lt;/p&gt;
&lt;p&gt;Specifically, they use pg_query in application-side tracing to tag a query with its fingerprint, and then, when looking at a slow trace, use that fingerprint in pganalyze to find more detailed information about database-side performance. This also inspired our &lt;a href=&quot;https://pganalyze.com/docs/opentelemetry&quot;&gt;recent integration with OpenTelemetry&lt;/a&gt;, which solves the same use case in a slightly different way.&lt;/p&gt;
&lt;h2 id=&quot;utilizing-deparsing-to-upgrade-queries-to-postgres-16-sqljson-syntax&quot; &gt;&lt;a href=&quot;#utilizing-deparsing-to-upgrade-queries-to-postgres-16-sqljson-syntax&quot; aria-label=&quot;utilizing deparsing to upgrade queries to postgres 16 sqljson syntax permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Utilizing deparsing to upgrade queries to Postgres 16 SQL/JSON syntax&lt;/h2&gt;
&lt;p&gt;Now to something new in the Postgres 16 release! In Postgres 16, one of the bigger syntax changes was the addition of SQL/JSON. pg_query fully supports it, both for parsing and for deparsing (which allows you to turn a syntax tree back into a SQL statement).&lt;/p&gt;
&lt;p&gt;We can use the pg_query deparser to write the equivalent of a codemod for SQL statements that rewrites the legacy syntax into the more standard SQL/JSON syntax.&lt;/p&gt;
&lt;p&gt;For example, imagine we have many places where we build JSON objects manually in SQL using the “json_build_object” function, and wanted to replace that with the new JSON_OBJECT syntax:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;q &lt;span &gt;=&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT json_build_object(&apos;key1&apos;, 1, &apos;key2&apos;, &apos;val&apos;);&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
q&lt;span &gt;.&lt;/span&gt;walk&lt;span &gt;!&lt;/span&gt; &lt;span &gt;do&lt;/span&gt; &lt;span &gt;|&lt;/span&gt;node&lt;span &gt;|&lt;/span&gt;
  &lt;span &gt;next&lt;/span&gt; &lt;span &gt;unless&lt;/span&gt; node&lt;span &gt;.&lt;/span&gt;is_a&lt;span &gt;?&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;Node&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;&amp;amp;&amp;amp;&lt;/span&gt; node&lt;span &gt;.&lt;/span&gt;node &lt;span &gt;==&lt;/span&gt; &lt;span &gt;:func_call&lt;/span&gt;
  func_name &lt;span &gt;=&lt;/span&gt; node&lt;span &gt;.&lt;/span&gt;func_call&lt;span &gt;.&lt;/span&gt;funcname&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;string&lt;span &gt;.&lt;/span&gt;sval
  &lt;span &gt;if&lt;/span&gt; func_name &lt;span &gt;==&lt;/span&gt; &lt;span &gt;&apos;json_build_object&apos;&lt;/span&gt;
    exprs &lt;span &gt;=&lt;/span&gt; node&lt;span &gt;.&lt;/span&gt;func_call&lt;span &gt;.&lt;/span&gt;args&lt;span &gt;.&lt;/span&gt;each_slice&lt;span &gt;(&lt;/span&gt;&lt;span &gt;2&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;map &lt;span &gt;do&lt;/span&gt; &lt;span &gt;|&lt;/span&gt;key&lt;span &gt;,&lt;/span&gt; value&lt;span &gt;|&lt;/span&gt;
      &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;Node&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;from&lt;span &gt;(&lt;/span&gt;
        &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;JsonKeyValue&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;new&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;
          key&lt;span &gt;:&lt;/span&gt; key&lt;span &gt;,&lt;/span&gt;
          value&lt;span &gt;:&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;JsonValueExpr&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;new&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;raw_expr&lt;span &gt;:&lt;/span&gt; value&lt;span &gt;)&lt;/span&gt;
        &lt;span &gt;)&lt;/span&gt;
      &lt;span &gt;)&lt;/span&gt;
    &lt;span &gt;end&lt;/span&gt;
    node&lt;span &gt;.&lt;/span&gt;inner &lt;span &gt;=&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;JsonObjectConstructor&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;new&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;exprs&lt;span &gt;:&lt;/span&gt; exprs&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;end&lt;/span&gt;
&lt;span &gt;end&lt;/span&gt;
q&lt;span &gt;.&lt;/span&gt;deparse
&lt;span &gt;# =&gt; &quot;SELECT JSON_OBJECT(&apos;key1&apos;: 1, &apos;key2&apos;: &apos;val&apos;)&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
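&lt;p&gt;To sanity-check such a rewrite, the legacy and the SQL/JSON form can be run side by side. This is a minimal sketch that assumes a Postgres 16 or newer server, since older versions will reject the &lt;code &gt;key: value&lt;/code&gt; syntax inside &lt;code &gt;JSON_OBJECT&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Both columns are expected to contain the same JSON object for a simple call like this
SELECT json_build_object(&apos;key1&apos;, 1, &apos;key2&apos;, &apos;val&apos;) AS legacy_syntax,
       JSON_OBJECT(&apos;key1&apos;: 1, &apos;key2&apos;: &apos;val&apos;)       AS sql_json_syntax;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;More complex calls, for example ones that involve NULL values or non-trivial key expressions, deserve a closer look before being rewritten automatically.&lt;/p&gt;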
&lt;p&gt;Each release, we test the pg_query deparser for completeness with the full set of Postgres regression tests, and be it SQL/JSON, or other new syntax, you can rest assured that pg_query supports it.&lt;/p&gt;
&lt;h2 id=&quot;alternate-parse-modes-to-work-with-plpgsql-expressions&quot; &gt;&lt;a href=&quot;#alternate-parse-modes-to-work-with-plpgsql-expressions&quot; aria-label=&quot;alternate parse modes to work with plpgsql expressions permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Alternate parse modes to work with PL/pgSQL expressions&lt;/h2&gt;
&lt;p&gt;Since &lt;a href=&quot;https://github.com/postgres/postgres/commit/844fe9f159a948377907a63d0ef3fb16dc51ce50&quot;&gt;Postgres 14&lt;/a&gt;, PL/pgSQL expressions are parsed through the regular “raw_parser” functionality, by passing a special mode flag that allows for PL/pgSQL-specific syntax.&lt;/p&gt;
&lt;p&gt;We didn’t support this in pg_query before, but thanks to &lt;a href=&quot;https://github.com/pganalyze/libpg_query/pull/216&quot;&gt;a contribution by Landan Cheruka&lt;/a&gt;, there is now a way to parse PL/pgSQL expressions directly with pg_query.&lt;/p&gt;
&lt;p&gt;Let’s first use parse_plpgsql to parse a function definition, using an example taken from the Postgres documentation:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; &lt;span &gt;REPLACE&lt;/span&gt; &lt;span &gt;FUNCTION&lt;/span&gt; cs_fmt_browser_version&lt;span &gt;(&lt;/span&gt;v_name &lt;span &gt;varchar&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                                                  v_version &lt;span &gt;varchar&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;RETURNS&lt;/span&gt; &lt;span &gt;varchar&lt;/span&gt; &lt;span &gt;AS&lt;/span&gt; $$
&lt;span &gt;BEGIN&lt;/span&gt;
  &lt;span &gt;IF&lt;/span&gt; v_version &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt; &lt;span &gt;THEN&lt;/span&gt;
    &lt;span &gt;RETURN&lt;/span&gt; v_name&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;END&lt;/span&gt; &lt;span &gt;IF&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;RETURN&lt;/span&gt; v_name &lt;span &gt;||&lt;/span&gt; &lt;span &gt;&apos;/&apos;&lt;/span&gt; &lt;span &gt;||&lt;/span&gt; v_version&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;END&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;$$&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;{&lt;/span&gt;
  &lt;span &gt;&quot;PLpgSQL_function&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;&quot;datums&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
      &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_var&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;refname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;v_name&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;datatype&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;typname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;UNKNOWN&quot;&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_var&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;refname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;v_version&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;datatype&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;typname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;UNKNOWN&quot;&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_var&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;refname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;found&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;datatype&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;PLpgSQL_type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;typname&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;UNKNOWN&quot;&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;
    &lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
    &lt;span &gt;&quot;action&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
      &lt;span &gt;&quot;PLpgSQL_stmt_block&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
        &lt;span &gt;&quot;body&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;PLpgSQL_stmt_if&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                &lt;span &gt;&quot;PLpgSQL_expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;query&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;v_version IS NULL&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;parseMode&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;
              &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;then_body&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
                &lt;span &gt;{&lt;/span&gt;
                  &lt;span &gt;&quot;PLpgSQL_stmt_return&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                    &lt;span &gt;&quot;expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                      &lt;span &gt;&quot;PLpgSQL_expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;query&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;v_name&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;parseMode&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;
                    &lt;span &gt;}&lt;/span&gt;
...
            &lt;span &gt;&quot;PLpgSQL_stmt_return&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                &lt;span &gt;&quot;PLpgSQL_expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt; &lt;span &gt;&quot;query&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;v_name || &apos;/&apos; || v_version&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;parseMode&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt; &lt;span &gt;}&lt;/span&gt;
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this function parse tree, you can see the different PLpgSQL_expr expressions, but the actual expression is just text. We can now use the new pg_query_parse_opts function to turn that text into a parse tree:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;&lt;span &gt;#&lt;/span&gt;&lt;span &gt;include&lt;/span&gt; &lt;span &gt;&amp;lt;pg_query.h&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span &gt;&lt;span &gt;#&lt;/span&gt;&lt;span &gt;include&lt;/span&gt; &lt;span &gt;&amp;lt;stdio.h&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span &gt;&lt;span &gt;#&lt;/span&gt;&lt;span &gt;include&lt;/span&gt; &lt;span &gt;&amp;lt;stdlib.h&gt;&lt;/span&gt;&lt;/span&gt;

&lt;span &gt;int&lt;/span&gt; &lt;span &gt;main&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
  PgQueryParseResult result&lt;span &gt;;&lt;/span&gt;

  result &lt;span &gt;=&lt;/span&gt; &lt;span &gt;pg_query_parse_opts&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;v_name || &apos;/&apos; || v_version&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; PG_QUERY_PARSE_PLPGSQL_EXPR&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

  &lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;result&lt;span &gt;.&lt;/span&gt;error&lt;span &gt;)&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;printf&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;error: %s at %d\n&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; result&lt;span &gt;.&lt;/span&gt;error&lt;span &gt;-&gt;&lt;/span&gt;message&lt;span &gt;,&lt;/span&gt; result&lt;span &gt;.&lt;/span&gt;error&lt;span &gt;-&gt;&lt;/span&gt;cursorpos&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;}&lt;/span&gt; &lt;span &gt;else&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;printf&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;%s\n&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; result&lt;span &gt;.&lt;/span&gt;parse_tree&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;}&lt;/span&gt;

  &lt;span &gt;pg_query_free_parse_result&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;result&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

  &lt;span &gt;return&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And that gives us a regular parse tree to work with:&lt;/p&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;{&lt;/span&gt;
	&lt;span &gt;&quot;version&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;160001&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
	&lt;span &gt;&quot;stmts&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
    	&lt;span &gt;{&lt;/span&gt;
        	&lt;span &gt;&quot;stmt&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
            	&lt;span &gt;&quot;SelectStmt&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                	&lt;span &gt;&quot;targetList&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
                    	&lt;span &gt;{&lt;/span&gt;
                        	&lt;span &gt;&quot;ResTarget&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                            	&lt;span &gt;&quot;val&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                                	&lt;span &gt;&quot;A_Expr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                                    	&lt;span &gt;&quot;kind&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;AEXPR_OP&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
…&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We’re still in the process of updating language bindings to support optionally using these parse modes, and would be curious to hear about more use cases for working with PL/pgSQL and pg_query.&lt;/p&gt;
&lt;h2 id=&quot;a-shout-out-to-the-community&quot; &gt;&lt;a href=&quot;#a-shout-out-to-the-community&quot; aria-label=&quot;a shout out to the community permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;A shout-out to the community&lt;/h2&gt;
&lt;p&gt;pg_query wouldn’t be the same without the community!&lt;/p&gt;
&lt;p&gt;We want to expressly call out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/lelit&quot;&gt;Lele Gaifax&lt;/a&gt; for maintaining the Python binding “pglast” and proactively testing libpg_query PRs&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/lcheruka&quot;&gt;Landan Cheruka&lt;/a&gt; for adding &lt;a href=&quot;https://github.com/pganalyze/libpg_query/pull/216&quot;&gt;support for alternate parse modes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/anuraaga&quot;&gt;Anuraag Agrawal&lt;/a&gt; for contributions to enable use in WebAssembly (see &lt;a href=&quot;https://github.com/wasilibs/go-pgquery&quot;&gt;pg_query_go without cgo&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/emin100&quot;&gt;Mehmet Emin KARAKAŞ&lt;/a&gt; for the many deparser improvements over the years&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/psteinroe&quot;&gt;Philipp Steinrötter&lt;/a&gt; for creating the &lt;a href=&quot;https://github.com/supabase/postgres_lsp&quot;&gt;Postgres Language Server&lt;/a&gt; based on pg_query.rs, and giving lots of good feedback on how things could work better&lt;/li&gt;
&lt;li&gt;And everyone else who contributed to libpg_query and related projects!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Looking ahead, we’re also excited to continue conversations with the Postgres community on how parts of pg_query could be upstreamed, so that a query parsing library could be provided directly as part of Postgres.&lt;/p&gt;
&lt;h2 id=&quot;in-conclusion&quot; &gt;&lt;a href=&quot;#in-conclusion&quot; aria-label=&quot;in conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;In conclusion&lt;/h2&gt;
&lt;p&gt;We’re excited about the &lt;strong&gt;&lt;a href=&quot;https://github.com/pganalyze/libpg_query&quot;&gt;new pg_query version&lt;/a&gt;&lt;/strong&gt;, and we’re always happy to hear about new use cases you find for using it to work with Postgres queries. If you have ideas on how pg_query could be better, feel free to &lt;a href=&quot;https://github.com/pganalyze/libpg_query&quot;&gt;open an issue on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;And if you’ve benefited from pg_query in the past, and have not yet tried out pganalyze to optimize your Postgres performance, you can &lt;a href=&quot;https://app.pganalyze.com/users/sign_up&quot;&gt;try out pganalyze with our free 14-day trial&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Postgres 16: Cumulative I/O statistics with pg_stat_io]]></title><description><![CDATA[One of the most common questions I get from people running Postgres databases at scale is:
How do I optimize the I/O operations of my database? Historically, getting a complete picture of all the I/O produced by a Postgres server has been challenging. To start with, Postgres splits its I/O activity into writing the WAL stream, and reads/writes to the data directory. The real challenge is understanding second-order effects around writes: Typically the write to the data directory happens after the…]]></description><link>https://pganalyze.com/blog/pg-stat-io</link><guid isPermaLink="false">https://pganalyze.com/blog/pg-stat-io</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Tue, 14 Feb 2023 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;One of the most common questions I get from people running Postgres databases at scale is:&lt;br /&gt;
&lt;strong&gt;How do I optimize the I/O operations of my database?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Historically, getting a complete picture of all the I/O produced by a Postgres server has been challenging. To start with, Postgres splits its I/O activity into writing the WAL stream, and reads/writes to the data directory. &lt;strong&gt;The real challenge is understanding second-order effects around writes&lt;/strong&gt;: Typically the write to the data directory happens after the transaction commits, and understanding which process actually writes to the data directory (and when) is hard.&lt;/p&gt;
&lt;p&gt;This whole situation has become an even bigger challenge in the cloud, when faced with provisioned IOPS, or worse, having to pay for individual I/Os like on Amazon Aurora. Often the solution has been to look at parts of the system that have instrumentation (such as individual queries), to get at least some sense for where the activity is happening.&lt;/p&gt;
&lt;p&gt;Last weekend, a &lt;strong&gt;major improvement to the visibility into I/O activity&lt;/strong&gt; &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a9c70b46dbe152e094f137f7e6ba9cd3a638ee25&quot;&gt;was committed&lt;/a&gt; to the upcoming Postgres 16 by Andres Freund, and authored by Melanie Plageman, with documentation contributed by Samay Sharma. My colleague Maciek Sakrejda and I have reviewed this patch through its various iterations, and we&apos;re very excited about what it brings to Postgres observability.&lt;/p&gt;
&lt;p&gt;Welcome, &lt;strong&gt;pg_stat_io&lt;/strong&gt;. Let&apos;s take a look:&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#querying-system-wide-io-statistics-in-postgres&quot;&gt;Querying system-wide I/O statistics in Postgres&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#use-cases-for-pg_stat_io&quot;&gt;Use cases for pg_stat_io&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#tracking-write-io-activity-in-postgres&quot;&gt;Tracking Write I/O activity in Postgres&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#improve-workload-stability-and-sizing-shared_buffers-by-monitoring-shared-buffer-evictions&quot;&gt;Improve workload stability and sizing shared_buffers by monitoring shared buffer evictions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#tracking-cumulative-io-activity-by-autovacuum-and-manual-vacuums&quot;&gt;Tracking cumulative I/O activity by autovacuum and manual VACUUMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#visibility-into-bulk-readwrite-strategies-sequential-scans-and-copy&quot;&gt;Visibility into bulk read/write strategies (sequential scans and COPY)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#sneak-peek-visualizing-pg_stat_io-in-pganalyze&quot;&gt;Sneak peek: Visualizing pg_stat_io in pganalyze&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#the-future-of-io-observability-in-postgres&quot;&gt;The future of I/O observability in Postgres&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;querying-system-wide-io-statistics-in-postgres&quot; &gt;&lt;a href=&quot;#querying-system-wide-io-statistics-in-postgres&quot; aria-label=&quot;querying system wide io statistics in postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Querying system-wide I/O statistics in Postgres&lt;/h2&gt;
&lt;p&gt;Let&apos;s start by using a local Postgres built fresh from the development branch. Note that Postgres 16 is still under heavy development, not even at beta stage, and should definitely not be used in production. For this I followed the &lt;a href=&quot;https://wiki.postgresql.org/wiki/Meson&quot;&gt;new cheatsheet for using the Meson build system&lt;/a&gt; (also new in Postgres 16), which significantly speeds up the build and test process.&lt;/p&gt;
&lt;p&gt;We can start by querying &lt;code &gt;pg_stat_io&lt;/code&gt; to get a sense for which information is tracked, omitting rows that are empty:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stat_io &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;reads&lt;/span&gt; &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; writes &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; extends &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;    backend_type     | io_object | io_context |  reads   | writes  | extends | op_bytes | evictions |  reuses  | fsyncs |          stats_reset          
---------------------+-----------+------------+----------+---------+---------+----------+-----------+----------+--------+-------------------------------
 autovacuum launcher | relation  | normal     |       19 |       5 |         |     8192 |        13 |          |      0 | 2023-02-13 11:50:27.583875-08
 autovacuum worker   | relation  | normal     |    15972 |    2494 |    2894 |     8192 |     17430 |          |      0 | 2023-02-13 11:50:27.583875-08
 autovacuum worker   | relation  | vacuum     |  5754853 | 3006563 |       0 |     8192 |      2056 |  5752594 |        | 2023-02-13 11:50:27.583875-08
 client backend      | relation  | bulkread   | 25832582 |  626900 |         |     8192 |    753962 | 25074439 |        | 2023-02-13 11:50:27.583875-08
 client backend      | relation  | bulkwrite  |     4654 | 2858085 | 3259572 |     8192 |    998220 |  2209070 |        | 2023-02-13 11:50:27.583875-08
 client backend      | relation  | normal     |   960291 |  376524 |  159497 |     8192 |   1103707 |          |      0 | 2023-02-13 11:50:27.583875-08
 client backend      | relation  | vacuum     |   128710 |       0 |       0 |     8192 |      1221 |   127489 |        | 2023-02-13 11:50:27.583875-08
 background worker   | relation  | bulkread   | 39059938 |  590896 |         |     8192 |    802939 | 38253662 |        | 2023-02-13 11:50:27.583875-08
 background worker   | relation  | normal     |   257533 |  118972 |       0 |     8192 |    256437 |          |      0 | 2023-02-13 11:50:27.583875-08
 background writer   | relation  | normal     |          |  243142 |         |     8192 |           |          |      0 | 2023-02-13 11:50:27.583875-08
 checkpointer        | relation  | normal     |          |  390141 |         |     8192 |           |          |  18812 | 2023-02-13 11:50:27.583875-08
 standalone backend  | relation  | bulkwrite  |        0 |       0 |       8 |     8192 |         0 |        0 |        | 2023-02-13 11:50:27.583875-08
 standalone backend  | relation  | normal     |      689 |     983 |     470 |     8192 |         0 |          |      0 | 2023-02-13 11:50:27.583875-08
 standalone backend  | relation  | vacuum     |       10 |       0 |       0 |     8192 |         0 |        0 |        | 2023-02-13 11:50:27.583875-08
(14 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At a high level, this information can be interpreted as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Statistics are tracked for a given backend type, I/O object type (i.e. whether it&apos;s a temporary table), and I/O context (more on that later)&lt;/li&gt;
&lt;li&gt;The main statistics are counting I/O operations: &lt;strong&gt;reads&lt;/strong&gt;, &lt;strong&gt;writes&lt;/strong&gt; and &lt;strong&gt;extends&lt;/strong&gt; (a special kind of write to resize data files)&lt;/li&gt;
&lt;li&gt;For each I/O operation the size in bytes is noted to help interpret the statistics (currently always block size, i.e., usually 8kB)&lt;/li&gt;
&lt;li&gt;Additionally, the number of shared buffer evictions, ring buffer re-uses and fsync calls are tracked&lt;/li&gt;
&lt;/ul&gt;
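&lt;p&gt;As a rough sketch of how these fields combine, multiplying an operation count by &lt;code &gt;op_bytes&lt;/code&gt; gives an approximate logical I/O volume per backend type. This assumes the column names shown in the development-branch output above; released Postgres versions may name or structure these columns differently, so check the documentation for your version:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Approximate logical I/O volume per backend type since the last stats reset.
-- Column names follow the development-branch output shown above (an assumption
-- for any other Postgres version).
SELECT backend_type,
       pg_size_pretty(sum(coalesce(reads, 0) * op_bytes))   AS read_volume,
       pg_size_pretty(sum(coalesce(writes, 0) * op_bytes))  AS write_volume,
       pg_size_pretty(sum(coalesce(extends, 0) * op_bytes)) AS extend_volume
  FROM pg_stat_io
 GROUP BY backend_type
 ORDER BY sum(coalesce(writes, 0) * op_bytes) DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code &gt;coalesce&lt;/code&gt; calls treat counters that are not tracked for a given row (shown as empty cells above) as zero, so every backend type still gets a readable total.&lt;/p&gt;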
&lt;p&gt;&lt;strong&gt;On Postgres 16, this system-wide information will always be available.&lt;/strong&gt; You can find the complete details of each field in the &lt;a href=&quot;https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-IO-VIEW&quot;&gt;Postgres documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Note that &lt;code &gt;pg_stat_io&lt;/code&gt; shows logical I/O operations issued by Postgres. Whilst this often eventually maps to an actual I/O to a disk (especially in the case of writes), the operating system has its own caching and batching mechanisms, and will for example often split up an 8kB write into two individual 4kB writes to the file system.&lt;/p&gt;
&lt;p&gt;Generally we can assume that this captures all I/O issued by Postgres, except for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I/O for writing the Write-Ahead-Log (WAL)&lt;/li&gt;
&lt;li&gt;Special cases such as tables being moved between tablespaces&lt;/li&gt;
&lt;li&gt;Temporary files (such as used for sorts, or extensions like &lt;code &gt;pg_stat_statements&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that temporary relations are tracked (they are not the same as temporary files): In &lt;code &gt;pg_stat_io&lt;/code&gt; these are marked as &lt;code &gt;io_object = &quot;temp relation&quot;&lt;/code&gt; - you may otherwise be familiar with them being called &quot;local buffers&quot; in other statistics views.&lt;/p&gt;
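&lt;p&gt;To look only at temporary relation activity, a filter along these lines could be used. It is a minimal sketch that assumes the development-branch column names from the output above; note that in an actual query the value is a single-quoted string literal:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Logical I/O on temporary relations (&quot;local buffers&quot;), per backend type.
-- io_object / io_context are the column names from the output shown earlier.
SELECT backend_type, io_context, reads, writes, extends
  FROM pg_stat_io
 WHERE io_object = &apos;temp relation&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;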
&lt;p&gt;With the basics in place, we can take a closer look at some use cases and learn why this matters.&lt;/p&gt;
&lt;h2 id=&quot;use-cases-for-pg_stat_io&quot; &gt;&lt;a href=&quot;#use-cases-for-pg_stat_io&quot; aria-label=&quot;use cases for pg_stat_io permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Use cases for pg_stat_io&lt;/h2&gt;
&lt;h3 id=&quot;tracking-write-io-activity-in-postgres&quot; &gt;&lt;a href=&quot;#tracking-write-io-activity-in-postgres&quot; aria-label=&quot;tracking write io activity in postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Tracking Write I/O activity in Postgres&lt;/h3&gt;
&lt;figure&gt;
&lt;span  &gt;
      &lt;a  src=&quot;https://pganalyze.com/static/62bff7b564c8c03267aab13d95edf800/ca98b/write_lifecycle.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Lifecycle of a write in Postgres&quot; title=&quot;Lifecycle of a write in Postgres&quot; src=&quot;https://pganalyze.com/static/62bff7b564c8c03267aab13d95edf800/1d69c/write_lifecycle.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;figcaption&gt;Lifecycle of a write in Postgres, and what is currently not visible in most statistics&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;When looking at a write in Postgres, we need to look beyond what a client sees as the query runtime, or something like pg_stat_statements can track. Postgres has a complex set of mechanisms that guarantee durability of writes, whilst allowing
clients to return quickly, trusting that the server has persisted the data in a crash safe manner.&lt;/p&gt;
&lt;p&gt;The first thing that Postgres does to persist data is to &lt;strong&gt;write it to the WAL.&lt;/strong&gt; Once this has succeeded, the client
will receive confirmation that the write has been successful. But what happens afterwards is where the additional
statistics tracking comes in handy.&lt;/p&gt;
&lt;p&gt;For example, if you look at a given INSERT statement in pg_stat_statements, the &lt;code &gt;shared_blks_written&lt;/code&gt; field is often going to tell you next to nothing, because the actual write to the data directory typically occurs at a later time, in order to batch writes for efficiency and to avoid I/O spikes.&lt;/p&gt;
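&lt;p&gt;One way to see this gap for yourself is to compare how many blocks a statement dirtied with how many it wrote out itself. The query below is an illustrative sketch, and it assumes the &lt;code &gt;pg_stat_statements&lt;/code&gt; extension is installed and loaded:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- INSERT statements that dirtied the most shared buffer pages.
-- shared_blks_written is often much smaller than shared_blks_dirtied,
-- since the dirty pages are typically written out later by another process.
SELECT left(query, 60) AS query,
       calls,
       shared_blks_dirtied,
       shared_blks_written
  FROM pg_stat_statements
 WHERE query ILIKE &apos;INSERT%&apos;
 ORDER BY shared_blks_dirtied DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;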
&lt;p&gt;In addition to writing the WAL, &lt;strong&gt;Postgres will also update the shared (or local) buffers for the write.&lt;/strong&gt; Such an update
will mark the buffer page in question as &quot;dirty&quot;.&lt;/p&gt;
&lt;p&gt;Then, in most cases, another process is responsible for actually
writing the dirty page to the data directory. There are three main process types to consider:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The background writer:&lt;/strong&gt; Runs continuously in the background to write out (some) dirty pages&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The checkpointer:&lt;/strong&gt; Runs on a scheduled basis, or based on amount of WAL written, and writes out all dirty pages not yet written&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;All other process types&lt;/strong&gt;, including regular client backends: Write out dirty pages if they need to evict the buffer page in question&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The main thing to understand is when the third case occurs - because &lt;strong&gt;it can drastically slow down queries&lt;/strong&gt;. Even a simple &quot;SELECT&quot; might have to suddenly write to disk, before it has enough space in shared buffers to read in its data.&lt;/p&gt;
&lt;p&gt;Historically you were already able to see some of this activity through the &lt;code &gt;pg_stat_bgwriter&lt;/code&gt; view, specifically the fields named &lt;code &gt;buffers_&lt;/code&gt;. However, this was incomplete, did not consider autovacuum activity explicitly, and did not let you understand the root cause of a write (e.g. a buffer eviction).&lt;/p&gt;
&lt;p&gt;With &lt;code &gt;pg_stat_io&lt;/code&gt; you can simply look at the &lt;code &gt;writes&lt;/code&gt; field, and see both an accurate aggregate number, as well as exactly which process in Postgres actually ended up writing your data to disk.&lt;/p&gt;
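&lt;p&gt;For instance, a query along the following lines breaks down writes by the type of process that issued them. This is a sketch under the assumption that the view has the columns shown in the earlier output:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Which process types wrote data pages, and their share of all writes.
SELECT backend_type,
       sum(writes) AS writes,
       round(100.0 * sum(writes) / nullif(sum(sum(writes)) OVER (), 0), 1) AS pct_of_writes
  FROM pg_stat_io
 WHERE writes &gt; 0
 GROUP BY backend_type
 ORDER BY sum(writes) DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A large share attributed to client backends, rather than to the background writer or checkpointer, would point at the third case described above, where queries end up writing out dirty pages themselves.&lt;/p&gt;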
&lt;h3 id=&quot;improve-workload-stability-and-sizing-shared_buffers-by-monitoring-shared-buffer-evictions&quot; &gt;&lt;a href=&quot;#improve-workload-stability-and-sizing-shared_buffers-by-monitoring-shared-buffer-evictions&quot; aria-label=&quot;improve workload stability and sizing shared_buffers by monitoring shared buffer evictions permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Improve workload stability and sizing shared_buffers by monitoring shared buffer evictions&lt;/h3&gt;
&lt;p&gt;One of the most important things that &lt;code &gt;pg_stat_io&lt;/code&gt; helps clarify is when a buffer page in shared buffers is evicted. Since shared buffers is a fixed-size pool of pages (each 8kB in size, on most Postgres systems), what is cached inside it matters a great deal - &lt;strong&gt;especially when your working set exceeds shared buffers&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;By default, if you&apos;re on a self-managed Postgres, the &lt;code &gt;shared_buffers&lt;/code&gt; setting is set to 128MB - or about 16,000 pages. Let&apos;s imagine you end up having loaded something through a very inefficient index scan, that ended up consuming all 128MB.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What happens when you suddenly read something completely different?&lt;/strong&gt; Postgres has to go and remove some of the old data from cache - also known as evicting a buffer page.&lt;/p&gt;
&lt;p&gt;This eviction has two main effects:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Data that was in Postgres buffer cache before, is no longer in the cache (note it may still be in the OS page cache)&lt;/li&gt;
&lt;li&gt;If the page that was evicted was marked as &quot;dirty&quot;, the process evicting it also has to write the old page to disk&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Both of these aspects matter for sizing shared buffers, and &lt;code &gt;pg_stat_io&lt;/code&gt; can clearly show this by tracking &lt;code &gt;evictions&lt;/code&gt; for each backend type across the system. Further, if you see a sudden spike in evictions followed by a lot of &lt;code &gt;reads&lt;/code&gt;, you can infer that the evicted data was actually needed again shortly afterwards. If in doubt, you can use the &lt;code &gt;pg_buffercache&lt;/code&gt; extension to look at the current shared buffers contents in detail.&lt;/p&gt;
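&lt;p&gt;As a rough sketch of how to look for this pattern, the first query below lists eviction counts next to reads and writes for each backend type, and the second one summarizes the current buffer contents. Both assume the column names shown in this article (&lt;code &gt;io_object&lt;/code&gt;, &lt;code &gt;io_context&lt;/code&gt;), which may be named differently in the Postgres version you are running, and the second query assumes the &lt;code &gt;pg_buffercache&lt;/code&gt; extension is installed:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Which backend types evict buffer pages, and how much do they read and write?
-- The io_context column tells you whether this is regular shared buffers
-- activity (normal) or one of the special ring buffer strategies.
SELECT backend_type, io_context, evictions, reads, writes
  FROM pg_stat_io
 WHERE io_object = &apos;relation&apos;
   AND evictions &gt; 0
 ORDER BY evictions DESC;

-- Top 10 relations of the current database by number of cached buffers
-- (requires CREATE EXTENSION pg_buffercache)
SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                              WHERE datname = current_database()))
 GROUP BY c.relname
 ORDER BY buffers DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;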
&lt;h3 id=&quot;tracking-cumulative-io-activity-by-autovacuum-and-manual-vacuums&quot; &gt;&lt;a href=&quot;#tracking-cumulative-io-activity-by-autovacuum-and-manual-vacuums&quot; aria-label=&quot;tracking cumulative io activity by autovacuum and manual vacuums permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Tracking cumulative I/O activity by autovacuum and manual VACUUMs&lt;/h3&gt;
&lt;p&gt;It&apos;s a fact that every Postgres server needs the occasional VACUUM - whether you schedule it manually, or have autovacuum take care of it for you. It helps clean up dead rows and makes space re-usable, and it freezes pages to prevent transaction ID wraparound.&lt;/p&gt;
&lt;p&gt;But there is such a thing as VACUUMing too often. If not tuned correctly, VACUUM and autovacuum can have a dramatic effect on I/O activity. Historically the best bet was to look at the output of &lt;code &gt;log_autovacuum_min_duration&lt;/code&gt;, which will give you information like this:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;  LOG:  automatic vacuum of table &quot;mydb.pg_toast.pg_toast_42593&quot;: index scans: 0
        pages: 0 removed, 13594 remain, 13594 scanned (100.00% of total)
        tuples: 0 removed, 54515 remain, 0 are dead but not yet removable
        removable cutoff: 11915, which was 6 XIDs old when operation ended
        new relfrozenxid: 11915, which is 4139 XIDs ahead of previous value
        frozen: 13594 pages from table (100.00% of total) had 54515 tuples frozen
        index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
        avg read rate: 0.113 MB/s, avg write rate: 0.113 MB/s
        buffer usage: 13614 hits, 13602 misses, 13600 dirtied
        WAL usage: 40786 records, 13600 full page images, 113072608 bytes
        system usage: CPU: user: 0.26 s, system: 0.52 s, elapsed: 939.84 s&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;From the &lt;code &gt;buffer usage&lt;/code&gt; you can determine that this single VACUUM had to read 13602 pages, and marked 13600 pages as dirty. But what if we want to get a more complete picture, and across all our VACUUMs?&lt;/p&gt;
&lt;p&gt;With &lt;code &gt;pg_stat_io&lt;/code&gt;, you can now see a system-wide measurement of the impact of VACUUM, by looking at everything marked as &lt;code &gt;io_context = &apos;vacuum&apos;&lt;/code&gt;, or associated to the &lt;code &gt;autovacuum worker&lt;/code&gt; backend type:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stat_io &lt;span &gt;WHERE&lt;/span&gt; backend_type &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;autovacuum worker&apos;&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;io_context &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;vacuum&apos;&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;reads&lt;/span&gt; &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; writes &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; extends &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;    backend_type    | io_object | io_context |  reads  | writes  | extends | op_bytes | evictions | reuses  | fsyncs |          stats_reset          
--------------------+-----------+------------+---------+---------+---------+----------+-----------+---------+--------+-------------------------------
 autovacuum worker  | relation  | bulkread   |       0 |       0 |         |     8192 |         0 |       0 |        | 2023-02-13 11:50:27.583875-08
 autovacuum worker  | relation  | normal     |   16306 |    2494 |    2915 |     8192 |     17785 |         |      0 | 2023-02-13 11:50:27.583875-08
 autovacuum worker  | relation  | vacuum     | 5824251 | 3028684 |       0 |     8192 |      2588 | 5821460 |        | 2023-02-13 11:50:27.583875-08
 client backend     | relation  | vacuum     |  128710 |       0 |       0 |     8192 |      1221 |  127489 |        | 2023-02-13 11:50:27.583875-08
 standalone backend | relation  | vacuum     |      10 |       0 |       0 |     8192 |         0 |       0 |        | 2023-02-13 11:50:27.583875-08
(5 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this particular example, the autovacuum worker has in total read 44.4 GB of data (5,824,251 buffer pages) and written 23.1 GB (3,028,684 buffer pages) in the &lt;code &gt;vacuum&lt;/code&gt; I/O context.&lt;/p&gt;
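&lt;p&gt;If you would rather not convert page counts by hand, a query along these lines lets Postgres do the arithmetic. It is a minimal sketch that assumes the &lt;code &gt;op_bytes&lt;/code&gt; and &lt;code &gt;io_context&lt;/code&gt; columns exist as shown in the output above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Convert page counts into human-readable sizes (pages * bytes per page)
SELECT backend_type, io_context,
       pg_size_pretty(reads * op_bytes)  AS read_size,
       pg_size_pretty(writes * op_bytes) AS written_size
  FROM pg_stat_io
 WHERE backend_type = &apos;autovacuum worker&apos;
   AND io_context = &apos;vacuum&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that &lt;code &gt;pg_size_pretty&lt;/code&gt; uses units based on powers of 1024.&lt;/p&gt;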
&lt;p&gt;&lt;strong&gt;If you track these statistics over time&lt;/strong&gt;, it will help you have a crystal-clear picture of whether autovacuum is to blame for an I/O spike during business hours. It will also help you make changes to tune autovacuum with more confidence, e.g. making autovacuum more aggressive to prevent bloat.&lt;/p&gt;
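&lt;p&gt;One simple way to track these counters over time, if you don&apos;t have a monitoring tool doing it for you, is to store periodic snapshots and compare consecutive ones. The following is only a sketch: the table name &lt;code &gt;pg_stat_io_snapshots&lt;/code&gt; is made up for this example, and the column names are assumed to match the output shown above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical snapshot table, created from the first snapshot
CREATE TABLE pg_stat_io_snapshots AS
SELECT now() AS captured_at, * FROM pg_stat_io;

-- Run this periodically (e.g. from a scheduled job) to record more snapshots
INSERT INTO pg_stat_io_snapshots
SELECT now(), * FROM pg_stat_io;

-- Autovacuum reads and writes (in pages) between consecutive snapshots
SELECT captured_at, io_context,
       reads  - lag(reads)  OVER w AS reads_delta,
       writes - lag(writes) OVER w AS writes_delta
  FROM pg_stat_io_snapshots
 WHERE backend_type = &apos;autovacuum worker&apos;
WINDOW w AS (PARTITION BY backend_type, io_object, io_context
             ORDER BY captured_at)
 ORDER BY captured_at, io_context;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The first snapshot of each group has no previous row, so its deltas are NULL. Deltas can also turn negative if the statistics were reset between two snapshots (see &lt;code &gt;stats_reset&lt;/code&gt;).&lt;/p&gt;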
&lt;h3 id=&quot;visibility-into-bulk-readwrite-strategies-sequential-scans-and-copy&quot; &gt;&lt;a href=&quot;#visibility-into-bulk-readwrite-strategies-sequential-scans-and-copy&quot; aria-label=&quot;visibility into bulk readwrite strategies sequential scans and copy permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Visibility into bulk read/write strategies (sequential scans and COPY)&lt;/h3&gt;
&lt;p&gt;Have you ever used COPY in Postgres to load data? Or read data from a table using a sequential scan? You may not know that in most cases, this data does not pass through shared buffers in the regular way. Instead, Postgres uses a special dedicated ring buffer that ensures that most of shared buffers is undisturbed by such large activities.&lt;/p&gt;
&lt;p&gt;Before &lt;code &gt;pg_stat_io&lt;/code&gt;, it was near impossible to understand this activity in Postgres, as &lt;strong&gt;there was simply no tracking for it&lt;/strong&gt;. Now, we can finally see both bulk reads (typically large sequential scans) and bulk writes (typically COPY in), and the I/O activity they cause.&lt;/p&gt;
&lt;p&gt;You can simply filter for the new &lt;code &gt;bulkwrite&lt;/code&gt; and &lt;code &gt;bulkread&lt;/code&gt; values in &lt;code &gt;io_context&lt;/code&gt;, and have visibility into this activity:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stat_io &lt;span &gt;WHERE&lt;/span&gt; io_context &lt;span &gt;IN&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;bulkread&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&apos;bulkwrite&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;reads&lt;/span&gt; &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; writes &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;OR&lt;/span&gt; extends &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;    backend_type    | io_object | io_context |  reads   | writes  | extends | op_bytes | evictions |  reuses  | fsyncs |          stats_reset          
--------------------+-----------+------------+----------+---------+---------+----------+-----------+----------+--------+-------------------------------
 client backend     | relation  | bulkread   | 25900458 |  627059 |         |     8192 |    754610 | 25141667 |        | 2023-02-13 11:50:27.583875-08
 client backend     | relation  | bulkwrite  |     4654 | 2858085 | 3259572 |     8192 |    998220 |  2209070 |        | 2023-02-13 11:50:27.583875-08
 background worker  | relation  | bulkread   | 39059938 |  590896 |         |     8192 |    802939 | 38253662 |        | 2023-02-13 11:50:27.583875-08
 standalone backend | relation  | bulkwrite  |        0 |       0 |       8 |     8192 |         0 |        0 |        | 2023-02-13 11:50:27.583875-08
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this example, there is 495 GB of bulk read activity, and 21 GB of bulk write activity we had no good way of identifying before. However, and most importantly, we don&apos;t have to worry about the &lt;code &gt;evictions&lt;/code&gt; count here - these are all evictions from the special bulk read / bulk write ring buffer, not from regular shared buffers.&lt;/p&gt;
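&lt;p&gt;Totals like these can be computed directly in SQL. The following sketch sums up reads and writes per I/O context across all backend types, again assuming the column names from the output above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Total bulk read and bulk write volume across all backend types
SELECT io_context,
       pg_size_pretty(sum(reads * op_bytes))  AS total_read,
       pg_size_pretty(sum(writes * op_bytes)) AS total_written
  FROM pg_stat_io
 WHERE io_context IN (&apos;bulkread&apos;, &apos;bulkwrite&apos;)
 GROUP BY io_context;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;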
&lt;h2 id=&quot;sneak-peek-visualizing-pg_stat_io-in-pganalyze&quot; &gt;&lt;a href=&quot;#sneak-peek-visualizing-pg_stat_io-in-pganalyze&quot; aria-label=&quot;sneak peek visualizing pg_stat_io in pganalyze permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Sneak peek: Visualizing pg_stat_io in pganalyze&lt;/h2&gt;
&lt;p&gt;It&apos;s still a while until Postgres 16 is released (usually September or October each year), but to help test things (and because it&apos;s exciting!) &lt;strong&gt;I took a quick stab at updating pganalyze in an experimental branch&lt;/strong&gt; to collect &lt;code &gt;pg_stat_io&lt;/code&gt; metrics and visualize them over time.&lt;/p&gt;
&lt;p&gt;Here is a very early look at what this may look like in the future:&lt;/p&gt;
&lt;figure&gt;
&lt;span  &gt;
      &lt;a  href=&quot;https://pganalyze.com/static/9643f0e4baa16706490fda146c7a3791/56fb6/pganalyze_pg_stat_io.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Screenshot of experimental pg_stat_io view in pganalyze&quot; title=&quot;Screenshot of experimental pg_stat_io view in pganalyze&quot; src=&quot;https://pganalyze.com/static/9643f0e4baa16706490fda146c7a3791/1d69c/pganalyze_pg_stat_io.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;figcaption&gt;Experimental view of what pg_stat_io could look like when visualized over time&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Even though this is just running locally on my laptop, already we can see a clear pattern where writes are done by the checkpointer and background writer processes, most of the time. We can also see my &lt;code &gt;checkpoint_timeout&lt;/code&gt; being set to &lt;code &gt;5min&lt;/code&gt; (the default), with both &lt;strong&gt;writes and fsyncs happening like clockwork&lt;/strong&gt; - note the workload is periodic every 10 minutes, so every second checkpoint has less work to do.&lt;/p&gt;
&lt;p&gt;However, we can also clearly see a spike in activity - and that spike can be easily explained: To generate more database activity, I triggered a big daily background process around 8:10pm UTC. The large amount of data read caused the working set to momentarily exceed shared buffers and led to a large number of buffer evictions, which in turn &lt;strong&gt;forced the client backend to write out buffer pages unexpectedly&lt;/strong&gt;.&lt;/p&gt;
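&lt;p&gt;Without a graph, a similar picture can be pieced together from the raw counters. The following sketch (assuming the column names used earlier in this article) shows which process types have been doing the writes and fsyncs since the last statistics reset:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Checkpoint interval currently in effect
SHOW checkpoint_timeout;

-- Who is writing out buffer pages? Ideally mostly the checkpointer and
-- background writer, rather than client backends.
SELECT backend_type, io_context, writes, fsyncs, evictions
  FROM pg_stat_io
 WHERE io_object = &apos;relation&apos;
   AND backend_type IN (&apos;checkpointer&apos;, &apos;background writer&apos;, &apos;client backend&apos;)
   AND (writes &gt; 0 OR fsyncs &gt; 0);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;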
&lt;p&gt;On this system I have a very small &lt;code &gt;shared_buffers&lt;/code&gt; setting (the default, 128 MB). I should probably increase shared_buffers...&lt;/p&gt;
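&lt;p&gt;For completeness, this is roughly how that change could be made on a self-managed server. The value of &lt;code &gt;1GB&lt;/code&gt; is purely illustrative and not a sizing recommendation:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Check the current value
SHOW shared_buffers;

-- Illustrative value only; pick a size that fits your workload and memory
ALTER SYSTEM SET shared_buffers = &apos;1GB&apos;;

-- shared_buffers can only be changed at server start,
-- so a restart is required for the new value to take effect&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;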
&lt;h2 id=&quot;the-future-of-io-observability-in-postgres&quot; &gt;&lt;a href=&quot;#the-future-of-io-observability-in-postgres&quot; aria-label=&quot;the future of io observability in postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The future of I/O observability in Postgres&lt;/h2&gt;
&lt;p&gt;A lot of the groundwork for &lt;code &gt;pg_stat_io&lt;/code&gt; actually happened previously in Postgres 15, through the new cumulative statistics system using shared memory.&lt;/p&gt;
&lt;p&gt;Before Postgres 15, statistics tracking had to go through the statistics collector (an obscure process that received UDP packets from the individual Postgres processes), which was slow and error-prone. This historically limited the ability to collect more advanced statistics easily. As the addition of &lt;code &gt;pg_stat_io&lt;/code&gt; shows, it is now much easier to track additional information about how Postgres operates.&lt;/p&gt;
&lt;p&gt;Amongst the immediate improvements that are already being discussed are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Tracking of system-wide buffer cache hits (to allow calculating an accurate buffer cache hit ratio)&lt;/li&gt;
&lt;li&gt;Cumulative system-wide I/O times (not just I/O counts as currently present in &lt;code &gt;pg_stat_io&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Better cumulative WAL statistics (i.e. going beyond what pg_stat_wal offers)&lt;/li&gt;
&lt;li&gt;Additional I/O tracking for tables and indexes&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Our team at pganalyze is excited to have helped shape the new &lt;code &gt;pg_stat_io&lt;/code&gt; view, and we look forward to continuing to work with the community on making Postgres better.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PS:&lt;/strong&gt; If you&apos;re interested in learning more about optimizing Postgres I/O performance and costs you can &lt;a href=&quot;https://pganalyze.com/webinars/optimizing-postgres-io-performance-and-costs&quot;&gt;check out our webinar recording&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[How Postgres Chooses Which Index To Use For A Query]]></title><description><![CDATA[Using Postgres sometimes feels like magic. But sometimes the magic is too much, such as when you are trying to understand the reason behind a seemingly bad Postgres query plan. I've often times found myself in a situation where I asked myself: "Postgres, what are you thinking?". Staring at an EXPLAIN plan, seeing a , and being puzzled as to why Postgres isn't doing what I am expecting. This has led me down the path of reading the Postgres source, in search for answers. Why is Postgres choosing a…]]></description><link>https://pganalyze.com/blog/how-postgres-chooses-index</link><guid isPermaLink="false">https://pganalyze.com/blog/how-postgres-chooses-index</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Fri, 01 Apr 2022 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Index usage of a Hash Join compared to a Nested Loop Join&quot; title=&quot;Index usage of a Hash Join compared to a Nested Loop Join&quot; src=&quot;https://pganalyze.com/static/d92dffc7749fada6f63b9488297dee5b/1d69c/nested_loop_vs_hash_join.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Using Postgres sometimes feels like magic. But sometimes the magic is too much, such as when you are trying to understand the reason behind a seemingly bad Postgres query plan.&lt;/p&gt;
&lt;p&gt;I&apos;ve often found myself in a situation where I asked myself: &lt;strong&gt;&quot;Postgres, what are you thinking?&quot;&lt;/strong&gt;. Staring at an EXPLAIN plan, seeing a &lt;code &gt;Sequential Scan&lt;/code&gt;, and being puzzled as to why Postgres isn&apos;t doing what I am expecting.&lt;/p&gt;
&lt;p&gt;This has led me down the path of reading the Postgres source, in search of answers. Why is Postgres choosing a particular index over another one, or not choosing an index altogether?&lt;/p&gt;
&lt;p&gt;In this blog post I aim to give an introduction to how the Postgres planner analyzes your query, and how it decides which indexes to use. Additionally, &lt;strong&gt;we’ll look at a puzzling situation&lt;/strong&gt; where the join type can impact which indexes are being used.&lt;/p&gt;
&lt;p&gt;We’ll look at a lot of Postgres source code, but if you are short on time, you might want to jump to &lt;a href=&quot;#understanding-b-tree-index-cost-estimates&quot;&gt;how B-tree index costing works&lt;/a&gt;, and &lt;a href=&quot;#parameterized-index-scans-or-why-nested-loop-are-sometimes-a-good-join-type&quot;&gt;why Nested Loop Joins impact index usage&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We’ll also talk about an &lt;a href=&quot;#new-features-coming-soon-to-pganalyze&quot;&gt;upcoming pganalyze feature&lt;/a&gt; at the very end!&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#a-tour-of-postgres-parse-analysis-and-early-stages-of-planning&quot;&gt;A tour of Postgres: Parse analysis and early stages of planning&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#four-levels-of-planning-a-query&quot;&gt;Four levels of planning a query&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#breaking-down-a-query-into-tables-being-scanned-reloptinfo-and-restrictinfo-structs&quot;&gt;Breaking down a query into tables being scanned (RelOptInfo and RestrictInfo structs)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#choosing-different-paths-and-scan-methods&quot;&gt;Choosing different paths and scan methods&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#where-index-scans-are-made&quot;&gt;Where Index Scans are made&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#creating-the-two-types-of-index-scans-plain-vs-parameterized&quot;&gt;Creating the two types of index scans: plain vs parameterized&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#understanding-b-tree-index-cost-estimates&quot;&gt;Understanding B-tree index cost estimates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#parameterized-index-scans-or-why-nested-loop-are-sometimes-a-good-join-type&quot;&gt;Parameterized Index Scans, or: Why Nested Loop are sometimes a good join type&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#new-features-coming-soon-to-pganalyze&quot;&gt;New features coming soon to pganalyze&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#other--helpful-resources&quot;&gt;Other  helpful resources&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;a-tour-of-postgres-parse-analysis-and-early-stages-of-planning&quot; &gt;&lt;a href=&quot;#a-tour-of-postgres-parse-analysis-and-early-stages-of-planning&quot; aria-label=&quot;a tour of postgres parse analysis and early stages of planning permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;A tour of Postgres: Parse analysis and early stages of planning&lt;/h2&gt;
&lt;p&gt;To start with, let’s look at a query’s lifecycle in Postgres. There are four important steps in how a query is handled:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Parsing: Turning query text into an Abstract Syntax Tree (AST)&lt;/li&gt;
&lt;li&gt;Parse analysis: Turning table names into actual references to table objects&lt;/li&gt;
&lt;li&gt;Planning: Finding and creating the optimal query plan&lt;/li&gt;
&lt;li&gt;Execution: Executing the query plan&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For understanding how the planner chooses which indexes to use, let’s first take a look at what parse analysis does.&lt;/p&gt;
&lt;p&gt;Whilst there are multiple entry points into parse analysis, depending on whether you have query parameters or not, the core function in parse analysis is &lt;code &gt;transformStmt&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/d22646922d66012705e0e2948cfb5b4a07092a29/src/backend/parser/analyze.c#L313&quot;&gt;source&lt;/a&gt;):&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
* transformStmt -
*    recursively transform a Parse tree into a Query tree.
*/&lt;/span&gt;
Query &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;transformStmt&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;ParseState &lt;span &gt;*&lt;/span&gt;pstate&lt;span &gt;,&lt;/span&gt; Node &lt;span &gt;*&lt;/span&gt;parseTree&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This takes the raw parse tree output (from the first step), and returns a Query struct. It has a lot of specific cases, as it handles both regular SELECTs as well as UPDATEs and other DML statements. Note that utility statements (DDL, etc) mostly get passed through to the execution phase.&lt;/p&gt;
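&lt;p&gt;If you are curious what such a Query struct looks like for one of your own statements, Postgres can print it. The following is a small sketch using the &lt;code &gt;debug_print_parse&lt;/code&gt; setting. The output is emitted at LOG level, so &lt;code &gt;client_min_messages&lt;/code&gt; is adjusted here to have it show up in the session:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Send LOG level messages to the client for this session
SET client_min_messages = log;
-- Print the tree produced by parse analysis for each statement
SET debug_print_parse = on;

-- Any statement will do; the Query tree is printed before the result
SELECT 1;

-- Back to the defaults
RESET debug_print_parse;
RESET client_min_messages;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;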
&lt;p&gt;Since we are interested in tables and indexes, let’s take a closer look at how parse analysis handles the FROM clause:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;void&lt;/span&gt;
&lt;span &gt;transformFromClause&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;ParseState &lt;span &gt;*&lt;/span&gt;pstate&lt;span &gt;,&lt;/span&gt; List &lt;span &gt;*&lt;/span&gt;frmList&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
   ListCell   &lt;span &gt;*&lt;/span&gt;fl&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;/*
    * The grammar will have produced a list of RangeVars, RangeSubselects,
    * RangeFunctions, and/or JoinExprs. Transform each one (possibly adding
    * entries to the rtable), check for duplicate refnames, and then add it
    * to the joinlist and namespace.
    */&lt;/span&gt;
   &lt;span &gt;foreach&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;fl&lt;span &gt;,&lt;/span&gt; frmList&lt;span &gt;)&lt;/span&gt;
   &lt;span &gt;{&lt;/span&gt;
       …
 
       n &lt;span &gt;=&lt;/span&gt; &lt;span &gt;transformFromClauseItem&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;pstate&lt;span &gt;,&lt;/span&gt; n&lt;span &gt;,&lt;/span&gt;
                                   &lt;span &gt;&amp;amp;&lt;/span&gt;nsitem&lt;span &gt;,&lt;/span&gt;
                                   &lt;span &gt;&amp;amp;&lt;/span&gt;namespace&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
…
&lt;span &gt;/*
* transformFromClauseItem -
*    Transform a FROM-clause item, adding any required entries to the
*    range table list being built in the ParseState, and return the
*    transformed item ready to include in the joinlist.  Also build a
*    ParseNamespaceItem list describing the names exposed by this item.
*    This routine can recurse to handle SQL92 JOIN expressions.
*/&lt;/span&gt;
&lt;span &gt;static&lt;/span&gt; Node &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;transformFromClauseItem&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;ParseState &lt;span &gt;*&lt;/span&gt;pstate&lt;span &gt;,&lt;/span&gt; Node &lt;span &gt;*&lt;/span&gt;n&lt;span &gt;,&lt;/span&gt;
                       ParseNamespaceItem &lt;span &gt;*&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;top_nsitem&lt;span &gt;,&lt;/span&gt;
                       List &lt;span &gt;*&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;namespace&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Postgres already distinguishes between the range table list (essentially a list of all the tables referenced by the query), and the joinlist. This distinction will also be visible at a later point in the planner.&lt;/p&gt;
&lt;p&gt;Note that at this point Postgres has not yet made up its mind which indexes to use - it just decided that the FROM reference you called “foobar” is actually the table “foobar” in the “public” schema with OID 16424.&lt;/p&gt;
&lt;p&gt;This information now gets stored in the Query struct, which is the result of the parse analysis phase. This Query struct is then passed into the planner, and that’s where it gets interesting.&lt;/p&gt;
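&lt;p&gt;The name resolution described above can be reproduced from SQL. As a sketch, assuming a table named “foobar” exists somewhere in your &lt;code &gt;search_path&lt;/code&gt;, the &lt;code &gt;regclass&lt;/code&gt; cast resolves a table name to its OID using the same search path rules:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Resolve the name foobar to its OID, schema and table name
-- (foobar is the example table name used in the text above)
SELECT c.oid, n.nspname AS schema_name, c.relname AS table_name
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.oid = &apos;foobar&apos;::regclass;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;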
&lt;h3 id=&quot;four-levels-of-planning-a-query&quot; &gt;&lt;a href=&quot;#four-levels-of-planning-a-query&quot; aria-label=&quot;four levels of planning a query permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Four levels of planning a query&lt;/h3&gt;
&lt;p&gt;Commonly we would start with the &lt;code &gt;standard_planner&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/db0d67db2401eb6238ccc04c6407a4fd4f985832/src/backend/optimizer/plan/planner.c#L282&quot;&gt;source&lt;/a&gt;) function as an entry point into the Postgres planner:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;PlannedStmt &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;standard_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;Query &lt;span &gt;*&lt;/span&gt;parse&lt;span &gt;,&lt;/span&gt; &lt;span &gt;const&lt;/span&gt; &lt;span &gt;char&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;query_string&lt;span &gt;,&lt;/span&gt; &lt;span &gt;int&lt;/span&gt; cursorOptions&lt;span &gt;,&lt;/span&gt;
                ParamListInfo boundParams&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This takes our &lt;code &gt;Query&lt;/code&gt; struct, and ultimately returns a &lt;code &gt;PlannedStmt&lt;/code&gt;. For reference, the &lt;code &gt;PlannedStmt&lt;/code&gt; struct (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/include/nodes/plannodes.h#L43&quot;&gt;source&lt;/a&gt;) looks like this:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/* ----------------
*      PlannedStmt node
*
* The output of the planner is a Plan tree headed by a PlannedStmt node.
* PlannedStmt holds the &quot;one time&quot; information needed by the executor.
* ----------------
*/&lt;/span&gt;
&lt;span &gt;typedef&lt;/span&gt; &lt;span &gt;struct&lt;/span&gt; &lt;span &gt;PlannedStmt&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
   NodeTag     type&lt;span &gt;;&lt;/span&gt;
 
   CmdType     commandType&lt;span &gt;;&lt;/span&gt;    &lt;span &gt;/* select|insert|update|delete|utility */&lt;/span&gt;
 
…
 
   &lt;span &gt;struct&lt;/span&gt; &lt;span &gt;Plan&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;planTree&lt;span &gt;;&lt;/span&gt;      &lt;span &gt;/* tree of Plan nodes */&lt;/span&gt;
 
…&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The tree of plan nodes is what you would be familiar with if you’ve looked at an EXPLAIN output before - ultimately EXPLAIN is based on walking that plan tree and showing you a text/JSON/etc version of it.&lt;/p&gt;
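&lt;p&gt;To illustrate, here is the same plan tree rendered in two formats. This assumes a hypothetical table “foobar” with an &lt;code &gt;id&lt;/code&gt; column; substitute any query of your own:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Default text rendering of the plan tree
EXPLAIN SELECT * FROM foobar WHERE id = 1;

-- The same plan tree as JSON, which is easier to process in tooling
EXPLAIN (FORMAT JSON) SELECT * FROM foobar WHERE id = 1;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;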
&lt;p&gt;The core function of the planner is best described in these lines of &lt;code &gt;standard_planner&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/* primary planning entry point (may recurse for subqueries) */&lt;/span&gt;
root &lt;span &gt;=&lt;/span&gt; &lt;span &gt;subquery_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;glob&lt;span &gt;,&lt;/span&gt; parse&lt;span &gt;,&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                        false&lt;span &gt;,&lt;/span&gt; tuple_fraction&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;/* Select best Path and turn it into a Plan */&lt;/span&gt;
final_rel &lt;span &gt;=&lt;/span&gt; &lt;span &gt;fetch_upper_rel&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; UPPERREL_FINAL&lt;span &gt;,&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
best_path &lt;span &gt;=&lt;/span&gt; &lt;span &gt;get_cheapest_fractional_path&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;final_rel&lt;span &gt;,&lt;/span&gt; tuple_fraction&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

top_plan &lt;span &gt;=&lt;/span&gt; &lt;span &gt;create_plan&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; best_path&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The planner first creates what are called “paths” using &lt;code &gt;subquery_planner&lt;/code&gt; (which may recursively call itself), and then the planner picks the best path. Based on this best path, the actual plan tree is constructed.&lt;/p&gt;
&lt;p&gt;For understanding how the planner chose which indexes to use, we must therefore look at paths, not at plan nodes. Let’s see what &lt;code &gt;subquery_planner&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/plan/planner.c#L596&quot;&gt;source&lt;/a&gt;) does:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*--------------------
* subquery_planner
*    Invokes the planner on a subquery.  We recurse to here for each
*    sub-SELECT found in the query tree.
…
*/&lt;/span&gt;
PlannerInfo &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;subquery_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerGlobal &lt;span &gt;*&lt;/span&gt;glob&lt;span &gt;,&lt;/span&gt; Query &lt;span &gt;*&lt;/span&gt;parse&lt;span &gt;,&lt;/span&gt;
                PlannerInfo &lt;span &gt;*&lt;/span&gt;parent_root&lt;span &gt;,&lt;/span&gt;
                bool hasRecursion&lt;span &gt;,&lt;/span&gt; &lt;span &gt;double&lt;/span&gt; tuple_fraction&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As described in the comment, this handles each sub-SELECT separately - but note that even if the original query contains a written sub-SELECT, the planner may optimize it away by pulling it up into the parent query’s planning process, if possible.&lt;/p&gt;
&lt;p&gt;For the purposes of focusing on index choice, here are the two key parts of &lt;code &gt;subquery_planner&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * Do the main planning.  If we have an inherited target relation, that
 * needs special processing, else go straight to grouping_planner.
 */&lt;/span&gt;
&lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;parse&lt;span &gt;-&gt;&lt;/span&gt;resultRelation &lt;span &gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span &gt;rt_fetch&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;parse&lt;span &gt;-&gt;&lt;/span&gt;resultRelation&lt;span &gt;,&lt;/span&gt; parse&lt;span &gt;-&gt;&lt;/span&gt;rtable&lt;span &gt;)&lt;/span&gt;&lt;span &gt;-&gt;&lt;/span&gt;inh&lt;span &gt;)&lt;/span&gt;
    &lt;span &gt;inheritance_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;else&lt;/span&gt;
    &lt;span &gt;grouping_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; false&lt;span &gt;,&lt;/span&gt; tuple_fraction&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

…

&lt;span &gt;/*
 * Make sure we&apos;ve identified the cheapest Path for the final rel.  (By
 * doing this here not in grouping_planner, we include initPlan costs in
 * the decision, though it&apos;s unlikely that will change anything.)
 */&lt;/span&gt;
&lt;span &gt;set_cheapest&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;final_rel&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This method also optimizes for the cheapest path - we’ll see more on that in a moment. But for now, let’s go deeper down the rabbit hole and look at &lt;code &gt;grouping_planner&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/plan/planner.c#L1253&quot;&gt;source&lt;/a&gt;):&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/* --------------------
 * grouping_planner
 *    Perform planning steps related to grouping, aggregation, etc.
 *
 * This function adds all required top-level processing to the scan/join
 * Path(s) produced by query_planner.
 *
 * --------------------
 */&lt;/span&gt;
&lt;span &gt;static&lt;/span&gt; &lt;span &gt;void&lt;/span&gt;
&lt;span &gt;grouping_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; bool inheritance_update&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;double&lt;/span&gt; tuple_fraction&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Reading through its code, it turns out we’re still not there. It’s actually &lt;code &gt;query_planner&lt;/code&gt; that we are looking for, as described in this comment:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;RelOptInfo &lt;span &gt;*&lt;/span&gt;current_rel&lt;span &gt;;&lt;/span&gt;
…
&lt;span &gt;/*
* Generate the best unsorted and presorted paths for the scan/join
* portion of this Query, ie the processing represented by the
* FROM/WHERE clauses.  (Note there may not be any presorted paths.)
* We also generate (in standard_qp_callback) pathkey representations
* of the query&apos;s sort clause, distinct clause, etc.
*/&lt;/span&gt;
current_rel &lt;span &gt;=&lt;/span&gt; &lt;span &gt;query_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; standard_qp_callback&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;qp_extra&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Before we dive into the &lt;code &gt;query_planner&lt;/code&gt; method, let’s pause for a moment and look at what the result of &lt;code &gt;query_planner&lt;/code&gt; is, the &lt;code &gt;RelOptInfo&lt;/code&gt; struct:&lt;/p&gt;
&lt;h3 id=&quot;breaking-down-a-query-into-tables-being-scanned-reloptinfo-and-restrictinfo-structs&quot; &gt;&lt;a href=&quot;#breaking-down-a-query-into-tables-being-scanned-reloptinfo-and-restrictinfo-structs&quot; aria-label=&quot;breaking down a query into tables being scanned reloptinfo and restrictinfo structs permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Breaking down a query into tables being scanned (RelOptInfo and RestrictInfo structs)&lt;/h3&gt;
&lt;p&gt;In the Postgres planner, &lt;code &gt;RelOptInfo&lt;/code&gt; is best described as the internal representation of a particular table that is being scanned (with either a sequential scan, or an index scan).&lt;/p&gt;
&lt;p&gt;When trying to understand how Postgres interprets your query, adding debug information that shows RelOptInfo would be the closest that you can get to seeing which tables Postgres is going to scan, and how it makes a decision between different scan methods, such as an Index Scan.&lt;/p&gt;
&lt;p&gt;RelOptInfo (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/include/nodes/pathnodes.h#L674&quot;&gt;source&lt;/a&gt;) has many details to it, but the key parts for our focus on indexing are these:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*----------
* RelOptInfo
*      Per-relation information for planning/optimization
…
*      pathlist - List of Path nodes, one for each potentially useful
*                 method of generating the relation
… 
*      baserestrictinfo - List of RestrictInfo nodes, containing info about
*                  each non-join qualification clause in which this relation
*                  participates (only used for base rels)
…
*      joininfo  - List of RestrictInfo nodes, containing info about each
*                  join clause in which this relation participates
…
*/&lt;/span&gt;
&lt;span &gt;typedef&lt;/span&gt; &lt;span &gt;struct&lt;/span&gt; &lt;span &gt;RelOptInfo&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   List       &lt;span &gt;*&lt;/span&gt;pathlist&lt;span &gt;;&lt;/span&gt;       &lt;span &gt;/* Path structures */&lt;/span&gt;
…
   List       &lt;span &gt;*&lt;/span&gt;baserestrictinfo&lt;span &gt;;&lt;/span&gt;   &lt;span &gt;/* RestrictInfo structures (if base rel) */&lt;/span&gt;
…
   List       &lt;span &gt;*&lt;/span&gt;joininfo&lt;span &gt;;&lt;/span&gt;       &lt;span &gt;/* RestrictInfo structures for join clauses
                                * involving this rel */&lt;/span&gt;
…
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Before we interpret this, let’s look at &lt;code &gt;RestrictInfo&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/include/nodes/pathnodes.h#L2067&quot;&gt;source&lt;/a&gt;):&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
* Restriction clause info.
*
* We create one of these for each AND sub-clause of a restriction condition
* (WHERE or JOIN/ON clause).  Since the restriction clauses are logically
* ANDed, we can use any one of them or any subset of them to filter out
* tuples, without having to evaluate the rest.
…
*/&lt;/span&gt;
&lt;span &gt;typedef&lt;/span&gt; &lt;span &gt;struct&lt;/span&gt; &lt;span &gt;RestrictInfo&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
   NodeTag     type&lt;span &gt;;&lt;/span&gt;
   Expr       &lt;span &gt;*&lt;/span&gt;clause&lt;span &gt;;&lt;/span&gt;         &lt;span &gt;/* the represented clause of WHERE or JOIN */&lt;/span&gt;
…
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A note on terminology: the comments above reference “base relations”, which are relations (i.e. tables) considered on their own, as opposed to in the context of a JOIN.&lt;/p&gt;
&lt;p&gt;In the code sample, &lt;code &gt;RestrictInfo&lt;/code&gt; is how our WHERE clause and JOIN conditions get represented. This is the part that is key to understanding how Postgres compares your query against the indexes that exist.&lt;/p&gt;
&lt;p&gt;You can think about it this way - for each table that’s included in the query, Postgres generates two lists of “restriction” clauses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Base restriction clauses&lt;/strong&gt;: Typically part of your WHERE clause, and are expressions that involve only the table itself - for example &lt;code &gt;users.id = 123&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Join clauses&lt;/strong&gt;: Typically part of your JOIN clause, and are expressions that involve multiple tables - for example &lt;code &gt;users.id = comments.user_id&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Postgres calls these “restriction” clauses because they restrict (or filter) the amount of data that is being returned from your table. &lt;strong&gt;And how can we effectively filter data from a table? By using an index!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The base restriction clauses will typically be used to filter down the amount of data that is being returned from the table. But join clauses oftentimes will not, as they are only used as part of the matching of rows that happens during the JOIN operation.&lt;/p&gt;
&lt;p&gt;The one exception to this is &lt;a href=&quot;https://pganalyze.com/docs/explain/join-nodes/nested-loop&quot;&gt;Nested Loop Joins&lt;/a&gt; - but we’ll come back to that.&lt;/p&gt;
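&lt;p&gt;To make the split between the two lists concrete, here is a minimal sketch in C. It is not Postgres code: &lt;code &gt;ToyClause&lt;/code&gt;, &lt;code &gt;toy_classify_clause&lt;/code&gt; and the other &lt;code &gt;toy_&lt;/code&gt; names are hypothetical, and a plain bitmask stands in for the more general set type that Postgres uses to track which relations a clause references. The only rule it illustrates is the one described above: a clause that references a single table is a base restriction clause, and a clause that references several tables is a join clause.&lt;/p&gt;

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a clause: one bit per table it references. */
typedef struct ToyClause
{
    const char *text;    /* the clause as written, for display only */
    uint32_t    relids;  /* bit i set = the clause references table number i */
} ToyClause;

typedef enum ToyClauseKind
{
    TOY_BASE_RESTRICTION, /* references exactly one table */
    TOY_JOIN_CLAUSE,      /* references two or more tables */
    TOY_NO_TABLE          /* references no table at all (e.g. a constant) */
} ToyClauseKind;

/* Count the set bits, i.e. the number of distinct tables referenced. */
static int
toy_count_tables(uint32_t relids)
{
    int count = 0;

    while (relids != 0)
    {
        relids &= relids - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Decide which of the two lists a clause would belong to. */
ToyClauseKind
toy_classify_clause(const ToyClause *clause)
{
    int ntables = toy_count_tables(clause->relids);

    if (ntables == 0)
        return TOY_NO_TABLE;
    return (ntables == 1) ? TOY_BASE_RESTRICTION : TOY_JOIN_CLAUSE;
}

/* A clause is only listed for a table that it actually references. */
/* table_no must be in the range 0..31 for this toy bitmask.        */
bool
toy_clause_mentions(const ToyClause *clause, int table_no)
{
    return (clause->relids & (UINT32_C(1) << table_no)) != 0;
}
```

&lt;p&gt;With table 0 standing for &lt;code &gt;users&lt;/code&gt; and table 1 for &lt;code &gt;comments&lt;/code&gt;, the clause &lt;code &gt;users.id = 123&lt;/code&gt; carries only bit 0 and is classified as a base restriction clause, whereas &lt;code &gt;users.id = comments.user_id&lt;/code&gt; carries bits 0 and 1 and is classified as a join clause that is listed for both tables.&lt;/p&gt;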
&lt;h3 id=&quot;choosing-different-paths-and-scan-methods&quot; &gt;&lt;a href=&quot;#choosing-different-paths-and-scan-methods&quot; aria-label=&quot;choosing different paths and scan methods permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Choosing different paths and scan methods&lt;/h3&gt;
&lt;p&gt;Let’s go back to &lt;code &gt;query_planner&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/plan/planmain.c#L55&quot;&gt;source&lt;/a&gt;), and what it does:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
* query_planner
*    Generate a path (that is, a simplified plan) for a basic query,
*    which may involve joins but not any fancier features.
*
* Since query_planner does not handle the toplevel processing (grouping,
* sorting, etc) it cannot select the best path by itself.  Instead, it
* returns the RelOptInfo for the top level of joining, and the caller
* (grouping_planner) can choose among the surviving paths for the rel.
…
*/&lt;/span&gt;
RelOptInfo &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;query_planner&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt;
             query_pathkeys_callback qp_callback&lt;span &gt;,&lt;/span&gt; &lt;span &gt;void&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;qp_extra&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   &lt;span &gt;/*
    * Construct RelOptInfo nodes for all base relations used in the query.
    */&lt;/span&gt;
   &lt;span &gt;add_base_rels_to_query&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;Node &lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; parse&lt;span &gt;-&gt;&lt;/span&gt;jointree&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
…
   &lt;span &gt;/*
    * Ready to do the primary planning.
    */&lt;/span&gt;
   final_rel &lt;span &gt;=&lt;/span&gt; &lt;span &gt;make_one_rel&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; joinlist&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;return&lt;/span&gt; final_rel&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The main point of &lt;code &gt;query_planner&lt;/code&gt; itself is to create a set of &lt;code &gt;RelOptInfo&lt;/code&gt; nodes, do a bunch of processing involving them, and then pass them to &lt;code &gt;make_one_rel&lt;/code&gt;. As that name says, it creates one “final rel”, which is also a &lt;code &gt;RelOptInfo&lt;/code&gt; node, that is then used to create our final plan.&lt;/p&gt;
&lt;p&gt;We’ve looked at a bunch of code already, but now it’s time to get to the exciting part!&lt;/p&gt;
&lt;p&gt;The implementation of &lt;code &gt;make_one_rel&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/path/allpaths.c#L153&quot;&gt;source&lt;/a&gt;) sits in a file with the important-sounding name of &lt;code &gt;allpaths.c&lt;/code&gt; - and as referenced earlier, when we talk about plan choices, we need to understand which path is chosen, as that is used to then create a plan node.&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * make_one_rel
 *    Finds all possible access paths for executing a query, returning a
 *    single rel that represents the join of all base rels in the query.
 */&lt;/span&gt;
RelOptInfo &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;make_one_rel&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; List &lt;span &gt;*&lt;/span&gt;joinlist&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   &lt;span &gt;/*
    * Compute size estimates and consider_parallel flags for each base rel.
    */&lt;/span&gt;
   &lt;span &gt;set_base_rel_sizes&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
…
 
   &lt;span &gt;/*
    * Generate access paths for each base rel.
    */&lt;/span&gt;
   &lt;span &gt;set_base_rel_pathlists&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
   &lt;span &gt;/*
    * Generate access paths for the entire join tree.
    */&lt;/span&gt;
   rel &lt;span &gt;=&lt;/span&gt; &lt;span &gt;make_rel_from_joinlist&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; joinlist&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;return&lt;/span&gt; rel&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Paths are chosen in three steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Estimate the sizes of the involved tables&lt;/li&gt;
&lt;li&gt;Find the best path for each base relation&lt;/li&gt;
&lt;li&gt;Find the best path for the entire join tree&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The first step is mainly concerned with size estimates as they relate to the output of scanning the relation. This impacts the cost and rows numbers you are familiar with from EXPLAIN - and this may impact joins, but typically should not directly impact index usage.&lt;/p&gt;
&lt;p&gt;Now step 2 is key to our goal here. And &lt;code &gt;set_base_rel_pathlists&lt;/code&gt; ultimately calls &lt;code &gt;set_plain_rel_pathlist&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/path/allpaths.c#L767&quot;&gt;source&lt;/a&gt;), which finally looks like what we are interested in:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * set_plain_rel_pathlist
 *    Build access paths for a plain relation (no subquery, no inheritance)
 */&lt;/span&gt;
&lt;span &gt;static&lt;/span&gt; &lt;span &gt;void&lt;/span&gt;
&lt;span &gt;set_plain_rel_pathlist&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; RelOptInfo &lt;span &gt;*&lt;/span&gt;rel&lt;span &gt;,&lt;/span&gt; RangeTblEntry &lt;span &gt;*&lt;/span&gt;rte&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
   …
 
   &lt;span &gt;/* Consider sequential scan */&lt;/span&gt;
   &lt;span &gt;add_path&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;rel&lt;span &gt;,&lt;/span&gt; &lt;span &gt;create_seqscan_path&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;,&lt;/span&gt; required_outer&lt;span &gt;,&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;/* If appropriate, consider parallel sequential scan */&lt;/span&gt;
   &lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;rel&lt;span &gt;-&gt;&lt;/span&gt;consider_parallel &lt;span &gt;&amp;amp;&amp;amp;&lt;/span&gt; required_outer &lt;span &gt;==&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
       &lt;span &gt;create_plain_partial_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;/* Consider index scans */&lt;/span&gt;
   &lt;span &gt;create_index_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   &lt;span &gt;/* Consider TID scans */&lt;/span&gt;
   &lt;span &gt;create_tidscan_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&quot;where-index-scans-are-made&quot; &gt;&lt;a href=&quot;#where-index-scans-are-made&quot; aria-label=&quot;where index scans are made permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Where Index Scans are made&lt;/h2&gt;
&lt;h3 id=&quot;creating-the-two-types-of-index-scans-plain-vs-parameterized&quot; &gt;&lt;a href=&quot;#creating-the-two-types-of-index-scans-plain-vs-parameterized&quot; aria-label=&quot;creating the two types of index scans plain vs parameterized permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Creating the two types of index scans: plain vs parameterized&lt;/h3&gt;
&lt;p&gt;Let’s look at &lt;code &gt;create_index_paths&lt;/code&gt; (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/path/indxpath.c#L235&quot;&gt;source&lt;/a&gt;), since we want to see how indexes are picked:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
* create_index_paths()
*    Generate all interesting index paths for the given relation.
*    Candidate paths are added to the rel&apos;s pathlist (using add_path).
*
* To be considered for an index scan, an index must match one or more
* restriction clauses or join clauses from the query&apos;s qual condition,
* or match the query&apos;s ORDER BY condition, or have a predicate that
* matches the query&apos;s qual condition.
*
* There are two basic kinds of index scans.  A &quot;plain&quot; index scan uses
* only restriction clauses (possibly none at all) in its indexqual,
* so it can be applied in any context.  A &quot;parameterized&quot; index scan uses
* join clauses (plus restriction clauses, if available) in its indexqual.
* When joining such a scan to one of the relations supplying the other
* variables used in its indexqual, the parameterized scan must appear as
* the inner relation of a nestloop join; it can&apos;t be used on the outer side,
* nor in a merge or hash join.
…
*/&lt;/span&gt;
&lt;span &gt;void&lt;/span&gt;
&lt;span &gt;create_index_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; RelOptInfo &lt;span &gt;*&lt;/span&gt;rel&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   &lt;span &gt;/* Examine each index in turn */&lt;/span&gt;
   &lt;span &gt;foreach&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;lc&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;-&gt;&lt;/span&gt;indexlist&lt;span &gt;)&lt;/span&gt;
   &lt;span &gt;{&lt;/span&gt;
       IndexOptInfo &lt;span &gt;*&lt;/span&gt;index &lt;span &gt;=&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;IndexOptInfo &lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;lfirst&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;lc&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
       …
 
       &lt;span &gt;/*
        * Ignore partial indexes that do not match the query.
        */&lt;/span&gt;
       &lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;index&lt;span &gt;-&gt;&lt;/span&gt;indpred &lt;span &gt;!=&lt;/span&gt; NIL &lt;span &gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span &gt;!&lt;/span&gt;index&lt;span &gt;-&gt;&lt;/span&gt;predOK&lt;span &gt;)&lt;/span&gt;
           &lt;span &gt;continue&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
       &lt;span &gt;/*
        * Identify the restriction clauses that can match the index.
        */&lt;/span&gt;
       &lt;span &gt;match_restriction_clauses_to_index&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; index&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;rclauseset&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
       &lt;span &gt;/*
        * Build index paths from the restriction clauses.  These will be
        * non-parameterized paths.  Plain paths go directly to add_path(),
        * bitmap paths are added to bitindexpaths to be handled below.
        */&lt;/span&gt;
       &lt;span &gt;get_index_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;,&lt;/span&gt; index&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;rclauseset&lt;span &gt;,&lt;/span&gt;
                       &lt;span &gt;&amp;amp;&lt;/span&gt;bitindexpaths&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
       &lt;span &gt;/*
        * Identify the join clauses that can match the index.  For the moment
        * we keep them separate from the restriction clauses.
        */&lt;/span&gt;
       &lt;span &gt;match_join_clauses_to_index&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;,&lt;/span&gt; index&lt;span &gt;,&lt;/span&gt;
                                   &lt;span &gt;&amp;amp;&lt;/span&gt;jclauseset&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;joinorclauses&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
…
       &lt;span &gt;/*
        * If we found any plain or eclass join clauses, build parameterized
        * index paths using them.
        */&lt;/span&gt;
       &lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;jclauseset&lt;span &gt;.&lt;/span&gt;nonempty &lt;span &gt;||&lt;/span&gt; eclauseset&lt;span &gt;.&lt;/span&gt;nonempty&lt;span &gt;)&lt;/span&gt;
           &lt;span &gt;consider_index_join_clauses&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;,&lt;/span&gt; index&lt;span &gt;,&lt;/span&gt;
                                       &lt;span &gt;&amp;amp;&lt;/span&gt;rclauseset&lt;span &gt;,&lt;/span&gt;
                                       &lt;span &gt;&amp;amp;&lt;/span&gt;jclauseset&lt;span &gt;,&lt;/span&gt;
                                       &lt;span &gt;&amp;amp;&lt;/span&gt;eclauseset&lt;span &gt;,&lt;/span&gt;
                                       &lt;span &gt;&amp;amp;&lt;/span&gt;bitjoinpaths&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
   &lt;span &gt;}&lt;/span&gt;
 
…
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There are a lot of things to take in here - and we’ve already removed BitmapOr/BitmapAnd index scans from this code sample.&lt;/p&gt;
&lt;p&gt;First of all, &lt;strong&gt;this builds two types of index scans&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Plain index scans&lt;/strong&gt;, that only use the base restriction clauses&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Parameterized index scans&lt;/strong&gt;, that use both base restriction clauses and join clauses&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We’ll talk more about the second case in a moment.&lt;/p&gt;
&lt;p&gt;Other key aspects to understand:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Partial indexes (i.e. those with an attached WHERE clause on the index definition) are matched against the set of restriction clauses and discarded here if they don’t match&lt;/li&gt;
&lt;li&gt;Each index is considered both for an Index Scan and an Index Only Scan (through the &lt;code &gt;build_index_paths&lt;/code&gt; function), as well as for a Bitmap Heap Scan / Bitmap Index Scan&lt;/li&gt;
&lt;li&gt;Each potential way of using an index gets a cost assigned - and this cost decides whether Postgres actually chooses the index (see earlier notion of the “best path”), or not&lt;/li&gt;
&lt;/ul&gt;
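&lt;p&gt;The idea that cost decides between candidate paths can be sketched in a few lines of C. This is a deliberately simplified illustration, not how Postgres implements it: &lt;code &gt;ToyPath&lt;/code&gt; and &lt;code &gt;toy_cheapest_path&lt;/code&gt; are hypothetical names, and the sketch compares only a single total cost, whereas the real &lt;code &gt;add_path&lt;/code&gt; logic also weighs startup cost, sort order, parameterization and row counts before it discards a path.&lt;/p&gt;

```c
#include <stddef.h>

/* Hypothetical, heavily simplified candidate path for one table. */
typedef struct ToyPath
{
    const char *description;  /* e.g. the scan method this path represents */
    double      total_cost;   /* estimated cost in planner cost units */
} ToyPath;

/*
 * Return the candidate with the lowest total cost, or NULL if there are
 * no candidates.  On a tie, the earlier candidate is kept.
 */
const ToyPath *
toy_cheapest_path(const ToyPath *paths, size_t npaths)
{
    const ToyPath *best = NULL;

    for (size_t i = 0; i < npaths; i++)
    {
        if (best == NULL || paths[i].total_cost < best->total_cost)
            best = &paths[i];
    }
    return best;
}
```

&lt;p&gt;Given one candidate for a sequential scan and one for each usable index, the function returns whichever has the lowest estimated cost. In this simplified view, an index is only chosen when its estimated cost undercuts every other way of reading the table.&lt;/p&gt;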
&lt;p&gt;For understanding how costing works, you can look at the &lt;code &gt;cost_index&lt;/code&gt; function (&lt;a href=&quot;https://github.com/postgres/postgres/blob/9f91344223aad903ff70301f40183691a89f6cd4/src/backend/optimizer/path/costsize.c#L492&quot;&gt;source&lt;/a&gt;), which gets called from &lt;code &gt;build_index_paths&lt;/code&gt; through a few hoops.&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
* cost_index
*    Determines and returns the cost of scanning a relation using an index.
…
* In addition to rows, startup_cost and total_cost, cost_index() sets the
* path&apos;s indextotalcost and indexselectivity fields.  These values will be
* needed if the IndexPath is used in a BitmapIndexScan.
*/&lt;/span&gt;
&lt;span &gt;void&lt;/span&gt;
&lt;span &gt;cost_index&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;IndexPath &lt;span &gt;*&lt;/span&gt;path&lt;span &gt;,&lt;/span&gt; PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; &lt;span &gt;double&lt;/span&gt; loop_count&lt;span &gt;,&lt;/span&gt;
          bool partial_path&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   &lt;span &gt;/*
    * Call index-access-method-specific code to estimate the processing cost
    * for scanning the index, as well as the selectivity of the index (ie,
    * the fraction of main-table tuples we will have to retrieve) and its
    * correlation to the main-table tuple order.
    */&lt;/span&gt;
   &lt;span &gt;amcostestimate&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; path&lt;span &gt;,&lt;/span&gt; loop_count&lt;span &gt;,&lt;/span&gt;
                  &lt;span &gt;&amp;amp;&lt;/span&gt;indexStartupCost&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;indexTotalCost&lt;span &gt;,&lt;/span&gt;
                  &lt;span &gt;&amp;amp;&lt;/span&gt;indexSelectivity&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;indexCorrelation&lt;span &gt;,&lt;/span&gt;
                  &lt;span &gt;&amp;amp;&lt;/span&gt;index_pages&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Whilst there are other factors in costing an index scan, the main responsibility falls to the &lt;a href=&quot;https://www.postgresql.org/docs/current/indexam.html&quot;&gt;Index Access Method&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;understanding-b-tree-index-cost-estimates&quot; &gt;&lt;a href=&quot;#understanding-b-tree-index-cost-estimates&quot; aria-label=&quot;understanding b tree index cost estimates permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Understanding B-tree index cost estimates&lt;/h3&gt;
&lt;p&gt;The most common index access method (or index type) is B-tree, so let’s look at &lt;code &gt;btcostestimate&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;void&lt;/span&gt;
&lt;span &gt;btcostestimate&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; IndexPath &lt;span &gt;*&lt;/span&gt;path&lt;span &gt;,&lt;/span&gt; &lt;span &gt;double&lt;/span&gt; loop_count&lt;span &gt;,&lt;/span&gt;
              Cost &lt;span &gt;*&lt;/span&gt;indexStartupCost&lt;span &gt;,&lt;/span&gt; Cost &lt;span &gt;*&lt;/span&gt;indexTotalCost&lt;span &gt;,&lt;/span&gt;
              Selectivity &lt;span &gt;*&lt;/span&gt;indexSelectivity&lt;span &gt;,&lt;/span&gt; &lt;span &gt;double&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;indexCorrelation&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;double&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;indexPages&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
…
   &lt;span &gt;/*
    * For a btree scan, only leading &apos;=&apos; quals plus inequality quals for the
    * immediately next attribute contribute to index selectivity (these are
    * the &quot;boundary quals&quot; that determine the starting and stopping points of
    * the index scan).
    */&lt;/span&gt;
   indexBoundQuals &lt;span &gt;=&lt;/span&gt; …
 
   &lt;span &gt;/*
    * If the index is partial, AND the index predicate with the
    * index-bound quals to produce a more accurate idea of the number of
    * rows covered by the bound conditions.
    */&lt;/span&gt;
   selectivityQuals &lt;span &gt;=&lt;/span&gt; &lt;span &gt;add_predicate_to_index_quals&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;index&lt;span &gt;,&lt;/span&gt; indexBoundQuals&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
 
   btreeSelectivity &lt;span &gt;=&lt;/span&gt; &lt;span &gt;clauselist_selectivity&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; selectivityQuals&lt;span &gt;,&lt;/span&gt;
                                             index&lt;span &gt;-&gt;&lt;/span&gt;rel&lt;span &gt;-&gt;&lt;/span&gt;relid&lt;span &gt;,&lt;/span&gt;
                                             JOIN_INNER&lt;span &gt;,&lt;/span&gt;
                                             &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
   numIndexTuples &lt;span &gt;=&lt;/span&gt; btreeSelectivity &lt;span &gt;*&lt;/span&gt; index&lt;span &gt;-&gt;&lt;/span&gt;rel&lt;span &gt;-&gt;&lt;/span&gt;tuples&lt;span &gt;;&lt;/span&gt;
…
   costs&lt;span &gt;.&lt;/span&gt;numIndexTuples &lt;span &gt;=&lt;/span&gt; numIndexTuples&lt;span &gt;;&lt;/span&gt;
   &lt;span &gt;genericcostestimate&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; path&lt;span &gt;,&lt;/span&gt; loop_count&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&amp;amp;&lt;/span&gt;costs&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
…&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, a lot revolves around determining how many index tuples will be matched by the scan, as that’s the main expensive portion of querying a B-tree index.&lt;/p&gt;
&lt;p&gt;The first step is determining the boundaries of the index scan, as it relates to the data stored in the index. In particular this is relevant for multi-column B-tree indexes, where only a subset of the columns might match the query.&lt;/p&gt;
&lt;p&gt;You may have heard before about the best practice of ordering B-tree columns so the columns that are queried by an equality comparison (“=” operator) are put first, followed by one optional inequality comparison (a range operator such as “&amp;#x3C;” or “&gt;”), followed by any other columns. This recommendation is based on the physical structure of the B-tree index, and the cost model also reflects this constraint.&lt;/p&gt;
&lt;p&gt;Put differently: The more specific you are with matching equality comparisons, the fewer parts of the index have to be scanned. This is represented here by the calculation of “btreeSelectivity”. If this number is small, the cost of the index scan will be lower, as determined by “genericcostestimate” based on the estimated number of index tuples being scanned.&lt;/p&gt;
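&lt;p&gt;To make the arithmetic concrete, here is a small sketch with assumed numbers (the table name &lt;code &gt;my_table&lt;/code&gt; and the selectivity of 0.001 are hypothetical). The row count that the planner multiplies with the selectivity starts from the estimate stored in &lt;code &gt;pg_class.reltuples&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Row-count and page-count estimates stored for a (hypothetical) table
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = &apos;my_table&apos;;

-- Assumed numbers: a selectivity of 0.001 on a table of 1,000,000 rows
SELECT 0.001 * 1000000 AS estimated_index_tuples;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With these assumed numbers the estimate comes out at about 1,000 index tuples. Halving the selectivity halves the number of index tuples the cost model has to account for.&lt;/p&gt;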
&lt;p&gt;&lt;strong&gt;For creating the ideal B-tree index, you would:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Focus on indexing columns used in equality comparisons&lt;/li&gt;
&lt;li&gt;Index the columns with the best selectivity (i.e. being most specific), so that only a small portion of the index has to be scanned&lt;/li&gt;
&lt;li&gt;Involve a small number of columns (possibly only one), to keep the index size small - and thus reduce the total number of pages in the index&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you follow these steps, you will create a B-tree index that has a low cost, and that Postgres should choose.&lt;/p&gt;
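&lt;p&gt;As a minimal sketch of this advice, assume a hypothetical &lt;code &gt;orders&lt;/code&gt; table (the table, column and index names are made up for illustration and are not part of the examples in this post). The equality column goes first, followed by the single range column:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table, used only for illustration
CREATE TABLE orders (
  id bigint PRIMARY KEY,
  customer_id bigint,
  created_at timestamptz
);

-- Equality column first, then the one range column
CREATE INDEX orders_customer_id_created_at_idx ON orders(customer_id, created_at);

-- The &quot;=&quot; qual and the range qual on the next column can both act as boundary quals
EXPLAIN SELECT *
FROM orders
WHERE customer_id = 42
  AND created_at &gt;= now() - interval &apos;7 days&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Whether Postgres actually picks this index still depends on the table size and its statistics, so the resulting plan is best checked with EXPLAIN on realistic data.&lt;/p&gt;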
&lt;p&gt;Now, there is one more thing we wanted to talk about, and that involves the notion of Parameterized Index Scans:&lt;/p&gt;
&lt;h3 id=&quot;parameterized-index-scans-or-why-nested-loop-are-sometimes-a-good-join-type&quot; &gt;&lt;a href=&quot;#parameterized-index-scans-or-why-nested-loop-are-sometimes-a-good-join-type&quot; aria-label=&quot;parameterized index scans or why nested loop are sometimes a good join type permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Parameterized Index Scans, or: Why Nested Loop are sometimes a good join type&lt;/h3&gt;
&lt;p&gt;As noted earlier, when Postgres looks at the potential index scans, it creates both plain index scans, and parameterized index scans.&lt;/p&gt;
&lt;p&gt;Plain index scans only use the parts of your query that reference the table itself, typically the conditions found in the WHERE clause.&lt;/p&gt;
&lt;p&gt;Parameterized index scans, on the other hand, use the parts of your query that reference two different tables. Oftentimes you would find these conditions in the JOIN clause.&lt;/p&gt;
&lt;p&gt;Let’s take a look at a practical example. Assume the following schema and indexes:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; t1 &lt;span &gt;(&lt;/span&gt;
  id &lt;span &gt;bigint&lt;/span&gt; &lt;span &gt;PRIMARY&lt;/span&gt; &lt;span &gt;KEY&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  field &lt;span &gt;text&lt;/span&gt;
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; t2 &lt;span &gt;(&lt;/span&gt;
  id &lt;span &gt;bigint&lt;/span&gt; &lt;span &gt;PRIMARY&lt;/span&gt; &lt;span &gt;KEY&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  t1_id &lt;span &gt;bigint&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  other_field &lt;span &gt;text&lt;/span&gt;
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; t1_field_idx &lt;span &gt;ON&lt;/span&gt; t1&lt;span &gt;(&lt;/span&gt;field&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; t2_t1_id_idx &lt;span &gt;ON&lt;/span&gt; t2&lt;span &gt;(&lt;/span&gt;t1_id&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And this query:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;FROM&lt;/span&gt; t1
&lt;span &gt;JOIN&lt;/span&gt; t2 &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;t1&lt;span &gt;.&lt;/span&gt;id &lt;span &gt;=&lt;/span&gt; t2&lt;span &gt;.&lt;/span&gt;t1_id&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;WHERE&lt;/span&gt; t1&lt;span &gt;.&lt;/span&gt;field &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;123&apos;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We have two tables to scan - t1 and t2.&lt;/p&gt;
&lt;p&gt;For t1, we can use a plain index scan on the &lt;code &gt;t1_field_idx&lt;/code&gt; index. That will perform well, since the query contains a specific value that ideally matches a small number of rows.&lt;/p&gt;
&lt;p&gt;When we run an EXPLAIN on the query, the simplest plan will look like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;FROM&lt;/span&gt; t1
&lt;span &gt;JOIN&lt;/span&gt; t2 &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;t1&lt;span &gt;.&lt;/span&gt;id &lt;span &gt;=&lt;/span&gt; t2&lt;span &gt;.&lt;/span&gt;t1_id&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;WHERE&lt;/span&gt; t1&lt;span &gt;.&lt;/span&gt;field &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;123&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                      QUERY PLAN                                       
---------------------------------------------------------------------------------------
 Hash Join  (cost=13.74..37.26 rows=5 width=88)
   Hash Cond: (t2.t1_id = t1.id)
   -&gt;  Seq Scan on t2  (cost=0.00..20.70 rows=1070 width=48)
   -&gt;  Hash  (cost=13.67..13.67 rows=6 width=40)
         -&gt;  Bitmap Heap Scan on t1  (cost=4.20..13.67 rows=6 width=40)
               Recheck Cond: (field = &apos;123&apos;::text)
               -&gt;  Bitmap Index Scan on t1_field_idx  (cost=0.00..4.20 rows=6 width=0)
                     Index Cond: (field = &apos;123&apos;::text)
(8 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or put visually:&lt;/p&gt;
&lt;p &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Index scans of an example Hash Join&quot; title=&quot;Index scans of an example Hash Join&quot; src=&quot;https://pganalyze.com/static/8889223c0f4a4b4b0c07c5f35bdf24eb/f8067/hash_join.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;As we can see, Postgres uses a Sequential Scan on t2. Let’s add some more data into the tables, to see if that changes the plan:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; t1 &lt;span &gt;SELECT&lt;/span&gt; val&lt;span &gt;,&lt;/span&gt; val::&lt;span &gt;text&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; generate_series&lt;span &gt;(&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;1000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;AS&lt;/span&gt; x&lt;span &gt;(&lt;/span&gt;val&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; t2 &lt;span &gt;SELECT&lt;/span&gt; val&lt;span &gt;,&lt;/span&gt; val&lt;span &gt;,&lt;/span&gt; val::&lt;span &gt;text&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; generate_series&lt;span &gt;(&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;1000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;AS&lt;/span&gt; x&lt;span &gt;(&lt;/span&gt;val&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that we are effectively creating exactly one entry that matches the &lt;code &gt;t1.field = &apos;123&apos;&lt;/code&gt; condition, and we are also creating exactly one t2 entry for each t1 entry.&lt;/p&gt;
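&lt;p&gt;One assumption worth making explicit: the planner can only react to the new rows once the table statistics reflect them. Autovacuum will usually take care of that after a while, but in a test session it can help to refresh the statistics by hand before looking at the plan again. A minimal sketch:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Refresh planner statistics for both tables
ANALYZE t1;
ANALYZE t2;

-- Optional: check the row-count estimates the planner starts from
SELECT relname, reltuples
FROM pg_class
WHERE relname IN (&apos;t1&apos;, &apos;t2&apos;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;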
&lt;p&gt;If we re-run the EXPLAIN, we get the following plan:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                  QUERY PLAN                                  
------------------------------------------------------------------------------
 Nested Loop  (cost=0.55..16.60 rows=1 width=30)
   -&gt;  Index Scan using t1_field_idx on t1  (cost=0.28..8.29 rows=1 width=11)
         Index Cond: (field = &apos;123&apos;::text)
   -&gt;  Index Scan using t2_t1_id_idx on t2  (cost=0.28..8.29 rows=1 width=19)
         Index Cond: (t1_id = t1.id)
(5 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Index scans of an example Nested Loop Join&quot; title=&quot;Index scans of an example Nested Loop Join&quot; src=&quot;https://pganalyze.com/static/cfc4b253e446e331010d8a28b593864b/50383/nested_loop_join.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;As you can see, we now get an index scan on &lt;code &gt;t2_t1_id_idx&lt;/code&gt;. This shows a Parameterized Index Scan in action - this is only possible because the join chosen by Postgres is a Nested Loop - not a Hash Join or Merge Join.&lt;/p&gt;
&lt;p&gt;A quick summary of how different join types impact index usage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Merge Join:&lt;/strong&gt; Needs sorted output from the scan node (thus can benefit from a sorted index like B-tree), but doesn&apos;t use the JOIN clause to restrict the data when scanning the table&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hash Join:&lt;/strong&gt; Doesn’t need sorted output, and doesn’t use the JOIN clause to restrict the data when scanning the table&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Nested Loop Join:&lt;/strong&gt; Doesn’t need sorted output from the scan node, but &lt;strong&gt;for one of the two tables&lt;/strong&gt; uses the JOIN clause to restrict the data when scanning the table&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Understanding what’s in your WHERE, your JOIN clause and your likely JOIN type is key, as all three will impact index usage.&lt;/p&gt;
&lt;p&gt;If you see a surprising Sequential Scan, you might want to review whether all possible index scans were parameterized index scans, and how the plan changes when you add an additional WHERE clause.&lt;/p&gt;
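&lt;p&gt;One way to explore this in a test session, sketched here with the t1 and t2 tables from above, is to discourage one join type at a time and compare the plans. The planner settings &lt;code &gt;enable_nestloop&lt;/code&gt;, &lt;code &gt;enable_hashjoin&lt;/code&gt; and &lt;code &gt;enable_mergejoin&lt;/code&gt; exist for this kind of experiment. They are meant for investigation, not as a permanent fix:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;BEGIN;

-- Discourage Nested Loop joins for this transaction only
SET LOCAL enable_nestloop = off;

-- Without a Nested Loop, the scan on t2 can no longer be parameterized by t1.id
EXPLAIN SELECT *
FROM t1
JOIN t2 ON (t1.id = t2.t1_id)
WHERE t1.field = &apos;123&apos;;

-- SET LOCAL is undone at the end of the transaction
ROLLBACK;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the index scan on &lt;code &gt;t2_t1_id_idx&lt;/code&gt; disappears from the plan once Nested Loops are discouraged, that is a strong hint that it was a Parameterized Index Scan.&lt;/p&gt;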
&lt;h2 id=&quot;new-features-coming-soon-to-pganalyze&quot; &gt;&lt;a href=&quot;#new-features-coming-soon-to-pganalyze&quot; aria-label=&quot;new features coming soon to pganalyze permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;New features coming soon to pganalyze&lt;/h2&gt;
&lt;p&gt;If you find you’re having a hard time reasoning about all of this, you are not alone!&lt;/p&gt;
&lt;p&gt;The reason we’ve spent a lot of time looking through these parts of the Postgres source code, is because they form the basis of a new upcoming version of the Index Advisor.&lt;/p&gt;
&lt;p&gt;And as part of the new Index Advisor, we’ll show you additional information for all scans on a table, to help you assess how Postgres uses existing indexes, and what the best indexing strategy might be.&lt;/p&gt;
&lt;p&gt;Here is a sneak peek from our current design iteration:&lt;/p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Upcoming pganalyze Scans UI&quot; title=&quot;Upcoming pganalyze Scans UI&quot; src=&quot;https://pganalyze.com/static/9dfa02d25d55a229d1fe898a27b1c2e7/1d69c/pganalyze_scans.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;p&gt;The same WHERE clause and JOIN clause data from the Postgres planner is shown in the Scans list, to help you make an assessment of how Postgres builds Plain Index Scans and Parameterized Index Scans for your queries.&lt;/p&gt;
&lt;p&gt;But more on this another day!&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this post we’ve gone down and chased through the Postgres source code until we’ve found the place where indexing decisions happen. We’ve looked at B-tree costing in particular, and looked at a puzzling case of how Nested Loops can affect index usage, by allowing the use of Parameterized Index Scans.&lt;/p&gt;
&lt;p&gt;If you optimize your queries, it helps to understand which tables you are scanning, and what the involved WHERE and JOIN clauses are. Additionally, it’s important to understand the different join types, and that only Nested Loop joins can use the JOIN clause to restrict an index scan, in the form of a Parameterized Index Scan.&lt;/p&gt;
&lt;p&gt;Do you think your peers might be interested in this article? &lt;a href=&quot;https://twitter.com/intent/tweet?text=%E2%80%9DHow%20%23Postgres%20Chooses%20Which%20Index%20To%20Use%20For%20A%20Query%E2%80%9D%20-%20In%20this%20article%2C%20%40pganalyze%20explain%20how%20the%20Postgres%20planner%20breaks%20down%20a%20query%20into%20scans%20and%20how%20this%20impacts%20indexing%20choices%3A%20https%3A%2F%2Fpganalyze.com%2Fblog%2Fhow-postgres-chooses-index&quot;&gt;Share this on Twitter&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;other--helpful-resources&quot; &gt;&lt;a href=&quot;#other--helpful-resources&quot; aria-label=&quot;other  helpful resources permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Other  helpful resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/postgres-create-index&quot;&gt;Using Postgres CREATE INDEX: Understanding operator classes, index types &amp;#x26; more&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/gin-index&quot;&gt;Understanding Postgres GIN Indexes: The Good and the Bad&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/deconstructing-the-postgres-planner&quot;&gt;How we deconstructed the Postgres planner to find indexing opportunities&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/introducing-pganalyze-index-advisor&quot;&gt;A better way to index your Postgres database: pganalyze Index Advisor&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;Effective Indexing in Postgres eBook&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;Best Practices for Optimizing Postgres Query Performance eBook&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-optimizing-postgres-text-search-trigrams-gist-indexes&quot;&gt;5mins of Postgres E6: Optimizing Postgres Text Search with Trigrams and GiST indexes&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Postgres in 2021: An Observer's Year In Review]]></title><description><![CDATA[Every January, the pganalyze team takes time to sit down to reflect on the year gone by. Of course, we are thinking about pganalyze, our customers and how we can improve our product. But, more importantly, we always take a bird's-eye view at what has happened in our industry, and specifically in the Postgres community. As you can imagine: A lot! So we thought: Instead of trying to summarize everything, let's review what happened with the Postgres project, and what is most exciting from our…]]></description><link>https://pganalyze.com/blog/postgres-2021-year-in-review</link><guid isPermaLink="false">https://pganalyze.com/blog/postgres-2021-year-in-review</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Fri, 07 Jan 2022 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Every January, the pganalyze team takes time to sit down to reflect on the year gone by. Of course, we are thinking about pganalyze, our customers and how we can improve our product. But, more importantly, we always take a bird&apos;s-eye view at what has happened in our industry, and specifically in the Postgres community. As you can imagine: A lot!&lt;/p&gt;
&lt;p&gt;So we thought: Instead of trying to summarize everything, &lt;strong&gt;let&apos;s review what happened with the Postgres project, and what is most exciting from our personal perspective&lt;/strong&gt;. Coincidentally, a new Postgres &lt;a href=&quot;https://commitfest.postgresql.org/&quot;&gt;Commitfest&lt;/a&gt; has just started, so it&apos;s the perfect time to read about new functionality that is being proposed by the PostgreSQL community.&lt;/p&gt;
&lt;p&gt;The following are my own thoughts on the past year of Postgres, and a few of the things that I&apos;m excited about looking ahead. Let&apos;s take a look:&lt;/p&gt;
&lt;p &gt;
&lt;span  &gt;
      &lt;a  src=&quot;https://pganalyze.com/static/01f82442d85bf3fb837f7eb6e385542b/ec605/postgres_2021_year_in_review_pganalyze.jpg&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Postgres: 2021 Year In Review&quot; title=&quot;Postgres: 2021 Year In Review&quot; src=&quot;https://pganalyze.com/static/01f82442d85bf3fb837f7eb6e385542b/acb04/postgres_2021_year_in_review_pganalyze.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#postgres-performance-sometimes-its-the-small-things&quot;&gt;Postgres Performance: Sometimes it&apos;s the small things&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#does-autovacuum-dream-of-64-bit-transaction-ids&quot;&gt;Does autovacuum dream of 64-bit Transaction IDs?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#explain-nested-loops-can-be-deceiving&quot;&gt;EXPLAIN: Nested Loops can be deceiving&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#extended-statistics-help-the-postgres-planner-do-its-job-better&quot;&gt;Extended Statistics: Help the Postgres planner do its job better&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#crustaceous-postgres-using-rust-for-extensions--more&quot;&gt;Crustaceous Postgres: Using Rust For Extensions &amp;#x26; more&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#other-highlights-from-postgres-development-in-2021&quot;&gt;Other highlights from Postgres development in 2021&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;postgres-performance-sometimes-its-the-small-things&quot; &gt;&lt;a href=&quot;#postgres-performance-sometimes-its-the-small-things&quot; aria-label=&quot;postgres performance sometimes its the small things permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Postgres Performance: Sometimes it&apos;s the small things&lt;/h2&gt;
&lt;p&gt;To start with, I wanted to look at one very specific change that I actually hadn&apos;t noticed until recently.&lt;/p&gt;
&lt;p&gt;Specifically: The performance of &lt;code &gt;IN&lt;/code&gt; clauses, and the work done to improve performance for long &lt;code &gt;IN&lt;/code&gt; lists in Postgres 14.&lt;/p&gt;
&lt;p&gt;First, let&apos;s set up a test table with some data that we can query:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; tbl &lt;span &gt;(&lt;/span&gt;
    id &lt;span &gt;int&lt;/span&gt;
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; tbl &lt;span &gt;SELECT&lt;/span&gt; i &lt;span &gt;FROM&lt;/span&gt; generate_series&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;&lt;span &gt;100000&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; n&lt;span &gt;(&lt;/span&gt;i&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, we run a very simple query with a long &lt;code &gt;IN&lt;/code&gt; list on Postgres 13:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres&lt;span &gt;=&lt;/span&gt;&lt;span &gt;# SELECT count(*) FROM tbl WHERE id IN ([... 1000 integer values ...]);&lt;/span&gt;
 count 
&lt;span &gt;-------&lt;/span&gt;
  &lt;span &gt;1000&lt;/span&gt;
&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt; &lt;span &gt;row&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;

&lt;span &gt;Time&lt;/span&gt;: &lt;span &gt;360.520&lt;/span&gt; ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is noticeably slow. With Postgres 14 however:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;postgres&lt;span &gt;=&lt;/span&gt;&lt;span &gt;# SELECT count(*) FROM tbl WHERE id IN ([... 1000 integer values ...]);&lt;/span&gt;
 count 
&lt;span &gt;-------&lt;/span&gt;
  &lt;span &gt;1000&lt;/span&gt;
&lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt; &lt;span &gt;row&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;

&lt;span &gt;Time&lt;/span&gt;: &lt;span &gt;12.246&lt;/span&gt; ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;An amazing 30x improvement!&lt;/strong&gt; Note that this is most pronounced with Sequential Scans, or other situations where the executor makes a lot of comparisons, i.e. when the expression shows up as a &lt;code &gt;Filter&lt;/code&gt; clause.&lt;/p&gt;
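&lt;p&gt;If you want to try this yourself, the long &lt;code &gt;IN&lt;/code&gt; list doesn&apos;t have to be typed by hand. The following is a sketch that assumes you are working in psql, and that simply uses the values 1 to 1000 for the list (the values used for the measurements above are not shown, so this is an assumption). It builds the query text with &lt;code &gt;format&lt;/code&gt; and &lt;code &gt;string_agg&lt;/code&gt;, and &lt;code &gt;\gexec&lt;/code&gt; then runs the generated statement:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- psql meta-command: report the execution time of each statement
\timing on

-- Build &quot;SELECT count(*) FROM tbl WHERE id IN (1, 2, ..., 1000)&quot; and execute it
SELECT format(
  &apos;SELECT count(*) FROM tbl WHERE id IN (%s)&apos;,
  string_agg(i::text, &apos;, &apos;)
)
FROM generate_series(1, 1000) AS n(i)
\gexec&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The absolute timings will differ from the ones shown here, depending on hardware and configuration, so the numbers are best compared between two Postgres versions on the same machine.&lt;/p&gt;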
&lt;p&gt;The reason I like this change is that it demonstrates what the Postgres community does well: Refine the existing system,
and optimize clear inefficiencies, without requiring users to change their queries.&lt;/p&gt;
&lt;p&gt;Of course, there are many other exciting performance efforts, here are a few:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://pganalyze.com/blog/postgres-14-performance-monitoring#improved-active-and-idle-connection-scaling-in-postgres-14&quot;&gt;Connection scaling improvements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://blog.jooq.org/postgresql-14s-enable_memoize-for-improved-performance-of-nested-loop-joins/&quot;&gt;Memoization of Nested Loops&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://www.postgresql.org/docs/14/libpq-pipeline-mode.html&quot;&gt;libpq pipelining&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3499/&quot;&gt;libpq compression&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3316/&quot;&gt;Asynchronous I/O and direct I/O&lt;/a&gt; (see also this &lt;a href=&quot;https://speakerdeck.com/azuredbpostgres/asynchronous-io-for-postgresql-pgcon-2020-andres-freund&quot;&gt;presentation&lt;/a&gt; by Andres Freund)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;does-autovacuum-dream-of-64-bit-transaction-ids&quot; &gt;&lt;a href=&quot;#does-autovacuum-dream-of-64-bit-transaction-ids&quot; aria-label=&quot;does autovacuum dream of 64 bit transaction ids permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Does autovacuum dream of 64-bit Transaction IDs?&lt;/h2&gt;
&lt;p&gt;Now, on to a much bigger topic. If you&apos;ve scaled Postgres, you&apos;ve likely come to meet the archenemy of a large Postgres installation: VACUUM, or rather its cousin, autovacuum, which cleans up dead tuples from your tables and advances the transaction ID horizon in Postgres.&lt;/p&gt;
&lt;p&gt;Much has been said (&lt;a href=&quot;https://blog.sentry.io/2015/07/23/transaction-id-wraparound-in-postgres&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://blog.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql&quot;&gt;2&lt;/a&gt;, &lt;a href=&quot;https://www.joyent.com/blog/manta-postmortem-7-27-2015&quot;&gt;3&lt;/a&gt;) about what happens when you hit &lt;strong&gt;Transaction ID (TXID) Wraparound&lt;/strong&gt;, a situation in which Postgres is unable to start a new transaction. A recent blog post illustrating Notion&apos;s motivation to shard their Postgres deployment puts it well:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;More worrying was the prospect of transaction ID (TXID) wraparound, a safety mechanism in which Postgres would stop processing all writes to avoid clobbering existing data.
Realizing that TXID wraparound would pose an existential threat to the product, our infrastructure team doubled down and got to work.
&lt;br/&gt;&lt;br/&gt;
&lt;em&gt;- &lt;a href=&quot;https://www.notion.so/blog/sharding-postgres-at-notion&quot;&gt;Garrett Fidalgo - Herding elephants: Lessons learned from sharding Postgres at Notion&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The root cause here is actually very simple: Transaction IDs are stored as 32-bit integers in Postgres, for example on individual rows in a table, where they identify when the row first became visible to other transactions.&lt;/p&gt;
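&lt;p&gt;To put numbers on this: a 32-bit counter has 2^32 (about 4.29 billion) distinct values, and because Postgres compares transaction IDs in a circular fashion, only about half of that, roughly 2 billion transactions, can separate the oldest unfrozen row from the current transaction. A common way to see how far along a cluster is, shown here as a sketch:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Age (in transactions) of the oldest unfrozen transaction ID per database
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The closer &lt;code &gt;xid_age&lt;/code&gt; gets to roughly 2 billion, the more urgent it is for VACUUM to catch up on that database.&lt;/p&gt;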
&lt;p&gt;&lt;strong&gt;Most people would agree that moving from 32-bit to 64-bit Transaction IDs is a good idea.&lt;/strong&gt; There have been multiple attempts over the years, but in recent weeks a new patch by Maxim Orlov has kickstarted a &lt;a href=&quot;https://www.postgresql.org/message-id/flat/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com&quot;&gt;new discussion&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Whilst the community&apos;s motivation to fix this is certainly there, the early reviews give a glimpse of what needs to be considered when moving to 64-bit TXIDs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;32-bit systems will have issues with atomic read/write of shared transaction ID variables&lt;/li&gt;
&lt;li&gt;Extremely long-running transactions could fail if they exceed the new &quot;short transaction ID&quot; boundary (which remains at 32-bit in this patch)&lt;/li&gt;
&lt;li&gt;On-disk format - keeping compatibility with the old format vs rewriting all data when an old cluster is upgraded (this patch tries to avoid changing the on-disk format)&lt;/li&gt;
&lt;li&gt;Multixact freeze still needs to happen at a somewhat regular frequency (one of the activities that VACUUM takes care of today)&lt;/li&gt;
&lt;li&gt;Memory overhead of larger 64-bit IDs in hot code paths (e.g. those optimized by recent connection scalability improvements)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And Peter Geoghegan &lt;a href=&quot;https://www.postgresql.org/message-id/flat/CAH2-Wzk68iW_z0rb8VxEchQavHLPLPXv_Vkx954B%3DBmqSrL_mQ%40mail.gmail.com#4d0f09cc88ae1ee58ba3278e827a82dc&quot;&gt;puts it succinctly&lt;/a&gt; in reviewing the patch:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I believe that a good solution to the problem that this patch tries to
solve needs to be more ambitious. I think that we need to return to
first principles, rather than extending what we have already.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Despite the email thread being titled &quot;Add 64-bit XIDs into PostgreSQL 15&quot;, given these concerns, it&apos;s extremely unlikely that a change like this would make it into Postgres 15 at this point - but one can dream, and look ahead to Postgres 16.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Looking for something you can use today?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Postgres 14 brought two great improvements in the area of VACUUM and bloat reduction: (1) &lt;a href=&quot;https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION&quot;&gt;The new bottom-up index deletion for B-tree indexes&lt;/a&gt;, (2) The new VACUUM &quot;emergency mode&quot; that provides better protection against impending TXID Wraparound.&lt;/p&gt;
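&lt;p&gt;To get a rough sense of how close individual tables are to triggering this emergency mode, you can compare their transaction ID age against the &lt;code &gt;vacuum_failsafe_age&lt;/code&gt; setting introduced in Postgres 14. This is a sketch rather than a complete monitoring query; the &lt;code &gt;LIMIT&lt;/code&gt; and the choice of relation kinds are arbitrary:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Relations with the oldest relfrozenxid, next to the failsafe threshold (Postgres 14+)
SELECT c.oid::regclass AS relation,
       age(c.relfrozenxid) AS xid_age,
       current_setting(&apos;vacuum_failsafe_age&apos;)::bigint AS failsafe_age
  FROM pg_class c
 WHERE c.relkind IN (&apos;r&apos;, &apos;m&apos;, &apos;t&apos;)  -- tables, materialized views, TOAST tables
 ORDER BY xid_age DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;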
&lt;h2 id=&quot;explain-nested-loops-can-be-deceiving&quot; &gt;&lt;a href=&quot;#explain-nested-loops-can-be-deceiving&quot; aria-label=&quot;explain nested loops can be deceiving permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;EXPLAIN: Nested Loops can be deceiving&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://wiki.postgresql.org/wiki/CommitFest&quot;&gt;Commitfests&lt;/a&gt; are about encouraging code reviews, first and foremost. Whilst looking through patches, I noticed a small one, which adds additional information about &lt;a href=&quot;https://commitfest.postgresql.org/36/2765/&quot;&gt;Nested Loops to EXPLAIN&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The patch was initially proposed back in 2020, and saw some minor refactorings in 2021, but no-one had reviewed it yet in this Commitfest. So I took the opportunity to &lt;a href=&quot;https://www.postgresql.org/message-id/flat/CAP53Pkw1d%2BsmuPvsVDecSnfphyZ46zrkSNjNEbSF3HA6-EsFkA%40mail.gmail.com#68986e7b42340acec3dc61f7af36cdf2&quot;&gt;review it&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;First, to understand what the patch aims to do, let&apos;s look at a common EXPLAIN output for a Nested Loop:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Nested Loop (actual rows=23 loops=1)
   Output: tbl1.col1, tprt.col1
   -&gt;  Seq Scan on public.tbl1 (actual rows=5 loops=1)
         Output: tbl1.col1
   -&gt;  Append (actual rows=5 loops=5)
         -&gt;  Index Scan using tprt1_idx on public.tprt_1 (actual rows=2 loops=5)
               Output: tprt_1.col1
               Index Cond: (tprt_1.col1 &amp;lt; tbl1.col1)
         -&gt;  Index Scan using tprt2_idx on public.tprt_2 (actual rows=3 loops=4)
               Output: tprt_2.col1
               Index Cond: (tprt_2.col1 &amp;lt; tbl1.col1)
         -&gt;  Index Scan using tprt3_idx on public.tprt_3 (actual rows=1 loops=2)
               Output: tprt_3.col1
               Index Cond: (tprt_3.col1 &amp;lt; tbl1.col1)
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Based on this we might assume that each loop produces 5 rows, as the existing &quot;actual rows&quot; statistic shows the average across all loops.&lt;/p&gt;
&lt;p&gt;But this example already shows where the math doesn&apos;t add up: the parent &lt;a href=&quot;https://pganalyze.com/docs/explain/other-nodes/append&quot;&gt;Append&lt;/a&gt; node returns 5 rows on average, but the &quot;actual rows&quot; of its child nodes add up to 6. And the top &lt;a href=&quot;https://pganalyze.com/docs/explain/join-nodes/nested-loop&quot;&gt;Nested Loop&lt;/a&gt; node returns 23 rows, but we can&apos;t clearly see which index these rows are being found in.&lt;/p&gt;
&lt;p&gt;With the patch in place, we get an extra row with &lt;code &gt;Loop&lt;/code&gt; information:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Nested Loop (actual rows=23 loops=1)
   Output: tbl1.col1, tprt.col1
   -&gt;  Seq Scan on public.tbl1 (actual rows=5 loops=1)
         Output: tbl1.col1
   -&gt;  Append (actual rows=5 loops=5)
         Loop Min Rows: 2  Max Rows: 6  Total Rows: 23
         -&gt;  Index Scan using tprt1_idx on public.tprt_1 (actual rows=2 loops=5)
               Loop Min Rows: 2  Max Rows: 2  Total Rows: 10
               Output: tprt_1.col1
               Index Cond: (tprt_1.col1 &amp;lt; tbl1.col1)
         -&gt;  Index Scan using tprt2_idx on public.tprt_2 (actual rows=3 loops=4)
               Loop Min Rows: 2  Max Rows: 3  Total Rows: 11
               Output: tprt_2.col1
               Index Cond: (tprt_2.col1 &amp;lt; tbl1.col1)
         -&gt;  Index Scan using tprt3_idx on public.tprt_3 (actual rows=1 loops=2)
               Loop Min Rows: 1  Max Rows: 1  Total Rows: 2
               Output: tprt_3.col1
               Index Cond: (tprt_3.col1 &amp;lt; tbl1.col1)
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can see how much clearer the picture is with this. We can understand that both &lt;code &gt;tprt1_idx&lt;/code&gt; and &lt;code &gt;tprt2_idx&lt;/code&gt; contributed about equally to the result. We can also see that some loop iterations have smaller row counts (2), while others have higher counts (6). When &lt;code &gt;TIMING&lt;/code&gt; is turned on, you also get information on the min/max time of the loop iterations.&lt;/p&gt;
&lt;p&gt;Given the prevalence of slow query plans that contain a Nested Loop, this appears to be a very useful patch. The main open item with this patch appears to be the slight overhead caused by collecting additional statistics - something to be discussed further on the mailing list.&lt;/p&gt;
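&lt;p&gt;To make the rounding problem concrete, here is a small worked example that uses only the numbers from the two plans above. Multiplying the averaged &quot;actual rows&quot; by &quot;loops&quot; only approximates the true total, which the patch reports directly as &quot;Total Rows&quot;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;SELECT node,
       avg_rows * loops AS approx_total,  -- what can be derived without the patch
       total_rows                         -- what the patch reports
  FROM (VALUES
         (&apos;Append&apos;,    5, 5, 23),
         (&apos;tprt1_idx&apos;, 2, 5, 10),
         (&apos;tprt2_idx&apos;, 3, 4, 11),
         (&apos;tprt3_idx&apos;, 1, 2, 2)
       ) AS plan(node, avg_rows, loops, total_rows);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For the Append node this yields 25 instead of 23, and for &lt;code &gt;tprt2_idx&lt;/code&gt; 12 instead of 11, because the per-loop average is rounded to a whole number.&lt;/p&gt;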
&lt;p&gt;Interested in other EXPLAIN improvements? Here&apos;s what happened recently:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://pganalyze.com/blog/postgres-14-performance-monitoring#monitor-queries-with-the-built-in-postgres-query_id&quot;&gt;pg_stat_statements queryid is now built into core&lt;/a&gt;, and shows in EXPLAIN output&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3050/&quot;&gt;Showing applied extended statistics in EXPLAIN&lt;/a&gt; (to quote my colleague Maciek: &quot;Oh neat, that&apos;s pretty cool.&quot;)&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3298/&quot;&gt;Showing I/O timings spent reading/writing temp buffers in EXPLAIN&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;extended-statistics-help-the-postgres-planner-do-its-job-better&quot; &gt;&lt;a href=&quot;#extended-statistics-help-the-postgres-planner-do-its-job-better&quot; aria-label=&quot;extended statistics help the postgres planner do its job better permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Extended Statistics: Help the Postgres planner do its job better&lt;/h2&gt;
&lt;p&gt;Going back to what you can use today: Extended Statistics on Expressions, released in Postgres 14.&lt;/p&gt;
&lt;p&gt;Let&apos;s back up there for a moment. If you are not familiar, &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-createstatistics.html&quot;&gt;extended statistics&lt;/a&gt; allow you to collect additional statistics about table contents, so the Postgres planner can provide better query plans.&lt;/p&gt;
&lt;p&gt;The general syntax is like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;STATISTICS&lt;/span&gt; &lt;span &gt;[&lt;/span&gt; &lt;span &gt;IF&lt;/span&gt; &lt;span &gt;NOT&lt;/span&gt; &lt;span &gt;EXISTS&lt;/span&gt; &lt;span &gt;]&lt;/span&gt; statistics_name
    &lt;span &gt;[&lt;/span&gt; &lt;span &gt;(&lt;/span&gt; statistics_kind &lt;span &gt;[&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt; &lt;span &gt;]&lt;/span&gt; &lt;span &gt;)&lt;/span&gt; &lt;span &gt;]&lt;/span&gt;
    &lt;span &gt;ON&lt;/span&gt; column_name&lt;span &gt;,&lt;/span&gt; column_name &lt;span &gt;[&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;
    &lt;span &gt;FROM&lt;/span&gt; table_name&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Before Postgres 14 you could already create extended statistics that help the planner understand the correlation between two columns, which is often necessary to avoid selectivity mis-estimates.&lt;/p&gt;
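&lt;p&gt;For illustration, such a multi-column statistics object could look like the following. The table and column names are hypothetical; the idea is that a ZIP code largely determines the city, so treating both conditions as independent would underestimate the number of matching rows:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table with two correlated columns
CREATE TABLE addresses (
    city text,
    zip text
);
-- Track functional dependencies between the two columns (available since Postgres 10)
CREATE STATISTICS addresses_city_zip (dependencies) ON city, zip FROM addresses;
-- Extended statistics are only populated by ANALYZE
ANALYZE addresses;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;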
&lt;p&gt;With the new extended statistics for expressions, you can inform the planner how selective a particular expression is, which in turn leads to better query plans. Here is an example of how to use this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; tbl &lt;span &gt;(&lt;/span&gt;
    a timestamptz
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;STATISTICS&lt;/span&gt; st &lt;span &gt;ON&lt;/span&gt; date_trunc&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;month&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; a&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; tbl&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will cause Postgres to collect statistics not only about &lt;code &gt;a&lt;/code&gt; itself (which it does by default), but also about the results of the expression that uses the &lt;code &gt;date_trunc&lt;/code&gt; function. You can find a complete example in the &lt;a href=&quot;https://www.postgresql.org/docs/14/sql-createstatistics.html&quot;&gt;Postgres docs&lt;/a&gt;.&lt;/p&gt;
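&lt;p&gt;Note that the statistics are only populated once the table has been analyzed. As a sketch, assuming the &lt;code &gt;tbl&lt;/code&gt; table from the example above contains data, you can then inspect what was collected through the &lt;code &gt;pg_stats_ext_exprs&lt;/code&gt; view that was added in Postgres 14:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Populate the extended statistics
ANALYZE tbl;
-- Show the per-expression statistics collected for this table
SELECT statistics_name, expr, n_distinct, most_common_vals
  FROM pg_stats_ext_exprs
 WHERE tablename = &apos;tbl&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;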
&lt;p&gt;In addition to this, there are many changes in-flight that are being discussed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3245/&quot;&gt;Improve selectivity estimates when extended statistics are present&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2831/&quot;&gt;Extended statistics for Var op Var clauses / Expr op Expr&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3055/&quot;&gt;Estimating JOINs using extended statistics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;crustaceous-postgres-using-rust-for-extensions--more&quot; &gt;&lt;a href=&quot;#crustaceous-postgres-using-rust-for-extensions--more&quot; aria-label=&quot;crustaceous postgres using rust for extensions  more permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Crustaceous Postgres: Using Rust For Extensions &amp;#x26; more&lt;/h2&gt;
&lt;p&gt;A side topic that isn&apos;t actually about Postgres development itself, but still pretty exciting on a larger scale: &lt;strong&gt;Postgres and Rust&lt;/strong&gt;. As you probably know, Postgres itself is written in C, and that is unlikely to change.&lt;/p&gt;
&lt;p&gt;However, there are two great examples of Rust being used to augment the Postgres ecosystem.&lt;/p&gt;
&lt;p&gt;First, you can write Postgres extensions in Rust using &lt;a href=&quot;https://github.com/zombodb/pgx&quot;&gt;pgx&lt;/a&gt;, and by now this approach has matured to the point that even established extension authors such as the TimescaleDB team have started adopting Rust for some of their projects, such as the &lt;a href=&quot;https://github.com/timescale/timescaledb-toolkit&quot;&gt;TimescaleDB toolkit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Second, new systems that build on Postgres are being developed with Rust as their language of choice, e.g. for networked services. The most interesting development in 2021 in this regard is the work of the team at &lt;a href=&quot;https://github.com/zenithdb/zenith&quot;&gt;ZenithDB&lt;/a&gt;, which is working on an Apache 2.0-licensed variant of a shared disk-type scale-out architecture (similar to Amazon Aurora), built on Postgres, with services written in Rust.&lt;/p&gt;
&lt;h2 id=&quot;other-highlights-from-postgres-development-in-2021&quot; &gt;&lt;a href=&quot;#other-highlights-from-postgres-development-in-2021&quot; aria-label=&quot;other highlights from postgres development in 2021 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Other highlights from Postgres development in 2021&lt;/h2&gt;
&lt;p&gt;A single post could never do everything justice, but here are a few more things that caught my attention:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Better Security:&lt;/strong&gt; No more (unexpected) &lt;a href=&quot;https://www.depesz.com/2021/09/10/waiting-for-postgresql-15-revoke-public-create-from-public-schema-now-owned-by-pg_database_owner/&quot;&gt;creation of objects in the public schema&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What&apos;s in your JSONB, really?&lt;/strong&gt; Find out by &lt;a href=&quot;https://commitfest.postgresql.org/36/3500/&quot;&gt;Collecting statistics about JSONB columns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ANALYZE + Partitioning:&lt;/strong&gt; Did you know partitioned parent tables may need manual ANALYZE? &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e1efc5b465c844969a0ed0d07e1364f3ce424d8c&quot;&gt;With Postgres 14 it&apos;s easier to keep track of it&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Filtering Logical Replication:&lt;/strong&gt; Want to filter your data &lt;a href=&quot;https://commitfest.postgresql.org/36/3230/&quot;&gt;by columns&lt;/a&gt;, or &lt;a href=&quot;https://commitfest.postgresql.org/36/2906/&quot;&gt;by rows&lt;/a&gt;? Postgres 15 may allow you to do both!&lt;/li&gt;
&lt;/ul&gt;
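&lt;p&gt;If you are on an older release and want an effect similar to the first item today, a commonly recommended hardening step is to revoke the CREATE privilege on the public schema yourself. This is a sketch of the manual equivalent, not the exact mechanism Postgres 15 uses, and it needs to be run in each database by the schema owner or a superuser:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Prevent ordinary roles from creating objects in the public schema
REVOKE CREATE ON SCHEMA public FROM PUBLIC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;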
&lt;p&gt;And even more things that are pretty cool:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://www.postgresql.org/docs/14/runtime-config-connection.html#GUC-PASSWORD-ENCRYPTION&quot;&gt;SCRAM-SHA-256 is now the default for password encryption&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://www.cybertec-postgresql.com/en/finally-a-system-level-read-all-data-role-for-postgresql/&quot;&gt;pg_read_all_data and pg_write_all_data roles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3138/&quot;&gt;Support for NSS as a libpq TLS backend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3414/&quot;&gt;Non-superuser subscription owners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3458/&quot;&gt;Support issuing SSL certificates for multiple IP addresses&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;JSON(B)&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;https://blog.crunchydata.com/blog/better-json-in-postgres-with-postgresql-14&quot;&gt;JSON subscript operators&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a@sigaev.ru&quot;&gt;Custom TOASTer for JSONB, and Pluggable Toast&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2902/&quot;&gt;JSON_TABLE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2901/&quot;&gt;SQL/JSON&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2482/&quot;&gt;jsonpath syntax extensions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Partitioning&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3052/&quot;&gt;Merging statistics from partition children instead of re-sampling everything&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2815/&quot;&gt;CREATE INDEX CONCURRENTLY on partitioned table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3071/&quot;&gt;Lazy JIT IR code generation to increase JIT speed with partitions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3478/&quot;&gt;AcquireExecutorLocks() and run-time pruning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/31/2694/&quot;&gt;Automatic partition creation&lt;/a&gt; (sadly this patch has no recent progress)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Logical Replication&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Postgres 14: &lt;a href=&quot;http://amitkapila16.blogspot.com/2021/09/logical-replication-improvements-in.html&quot;&gt;Multiple improvements &amp;#x26; performance fixes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Postgres 15: &lt;a href=&quot;https://www.depesz.com/2021/11/16/waiting-for-postgresql-15-allow-publishing-the-tables-of-schema/&quot;&gt;Allow publishing all tables in a schema&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Postgres 15: &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8d74fc96db5fd547e077bf9bf4c3b67f821d71cd&quot;&gt;pg_stat_subscription_workers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/2968/&quot;&gt;Logical Decoding on standbys&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3393/&quot;&gt;Synchronize Logical Replication slots to standbys&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;In Development: &lt;a href=&quot;https://commitfest.postgresql.org/36/3155/&quot;&gt;Logical decoding and replication of sequences&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&amp;#x26; more :)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You may also enjoy &lt;a href=&quot;https://commitfest.postgresql.org/36/&quot;&gt;looking at the current Commitfest&lt;/a&gt;, to make up your own mind.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The above might feel quite extensive, but that&apos;s not nearly all of the things that have happened with Postgres in 2021. I&apos;m excited to be part of such a vibrant community contributing to making Postgres continuously better and am eager to see what&apos;s to come for Postgres in 2022!&lt;/p&gt;
&lt;p&gt;At pganalyze we&apos;re committed to providing the best &lt;strong&gt;Postgres monitoring and observability&lt;/strong&gt; to help you uncover deep insights about Postgres performance. Whether your Postgres runs in the cloud, your on-premises data center, or a Raspberry Pi: &lt;a href=&quot;https://app.pganalyze.com/users/sign_up&quot;&gt;You can give pganalyze a try&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/intent/tweet?text=%E2%80%9DPostgres%20in%202021%3A%20An%20Observer%27s%20Year%20In%20Review%E2%80%9D%20-%20In%20this%20article%2C%20%40pganalyze%20take%20a%20look%20at%20Postgres%2014%2C%20explaining%20nested%20loops%2C%20extended%20statistics%20for%20expressions%2C%2064-bit%20transaction%20IDs%2C%20and%20more%20exciting%20Postgres%20patches%20from%202021%3A%20https%3A%2F%2Fpganalyze.com%2Fblog%2Fpostgres-2021-year-in-review&quot;&gt;Share this on Twitter&lt;/a&gt;&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Understanding Postgres GIN Indexes: The Good and the Bad]]></title><description><![CDATA[Adding, tuning and removing indexes is an essential part of maintaining an application that uses a database. Oftentimes, our applications rely on sophisticated database features and data types, such as JSONB, array types or full text search in Postgres. A simple B-tree index does not work in such situations, for example to index a JSONB column. Instead, we need to look beyond, to GIN indexes. Almost 15 years ago to the dot, GIN indexes were added in Postgres 8.2, and they have since become an…]]></description><link>https://pganalyze.com/blog/gin-index</link><guid isPermaLink="false">https://pganalyze.com/blog/gin-index</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 02 Dec 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Diagram of GIN index structure&quot; title=&quot;Diagram of GIN index structure&quot; src=&quot;https://pganalyze.com/static/718f52cb037c0a56a45cb32a73db791e/1d69c/gin_diagram.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Adding, tuning and removing indexes is an essential part of maintaining an application that uses a database. Oftentimes, our applications rely on sophisticated database features and data types, such as JSONB, array types or full text search in Postgres. &lt;strong&gt;A simple B-tree index does not work in such situations, for example to index a JSONB column&lt;/strong&gt;. Instead, we need to look beyond, to GIN indexes.&lt;/p&gt;
&lt;p&gt;Almost 15 years ago to the day, &lt;a href=&quot;http://www.sai.msu.su/~megera/wiki/Gin&quot;&gt;GIN indexes were added in Postgres 8.2&lt;/a&gt;, and they have since become an essential tool in the application DBA’s toolbox. GIN indexes can seem like magic, as they can index what a normal B-tree cannot, such as JSONB data types and full text search. With this great power comes great responsibility, as GIN indexes can have adverse effects if used carelessly.&lt;/p&gt;
&lt;p&gt;In this article, we’ll take an in-depth look at GIN indexes in Postgres, building on, and referencing many great articles that have been written over the years by the community. We’ll start by reviewing &lt;strong&gt;what GIN indexes can do, how they are structured, and their most common use cases&lt;/strong&gt;, such as for indexing JSONB columns, or to support &lt;a href=&quot;https://pganalyze.com/ebooks/efficient-search-in-rails-with-postgres&quot;&gt;Postgres full text search&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But, understanding the fundamentals is only part of the puzzle. It’s much better when we can also learn from real world examples on busy databases. We’ll review a specific situation that the GitLab database team found themselves in this year, as it relates to write overhead caused by GIN indexes on a busy table with more than 1000 updates per minute.&lt;/p&gt;
&lt;p&gt;And we’ll conclude with a review of the trade-offs between the GIN write overhead and the possible performance gains. Plus: We’ve added support for GIN index recommendations to the pganalyze Index Advisor.&lt;/p&gt;
&lt;p&gt;To start with, let’s review what a GIN index looks like:&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#gin-index-in-postgres-what-is-it-actually&quot;&gt;GIN Index in Postgres: What is it actually?&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#indexing-tsvector-columns-for-postgres-full-text-search&quot;&gt;Indexing tsvector columns for Postgres full text search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#indexing-like-searches-with-trigrams-and-gin_trgm_ops&quot;&gt;Indexing LIKE searches with Trigrams and gin_trgm_ops&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#postgresql-jsonb-and-gin-indexes&quot;&gt;PostgreSQL, JSONB and GIN Indexes&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#postgres-gin-index-for-jsonb-columns-using-jsonb_ops-and-jsonb_path_ops&quot;&gt;Postgres GIN index for JSONB columns using jsonb_ops and jsonb_path_ops&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#multi-column-gin-indexes-and-combining-gin-and-b-tree-indexes&quot;&gt;Multi-Column GIN Indexes, and Combining GIN and B-tree indexes&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#the-downside-of-gin-indexes-expensive-updates&quot;&gt;The downside of GIN Indexes: Expensive Updates&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#gin-trigram-indexes-a-lesson-from-gitlab&quot;&gt;GIN trigram indexes: A lesson from GitLab&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#measuring-gin-pending-list-overhead-and-size&quot;&gt;Measuring GIN pending list overhead and size&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#strategies-for-dealing-with-gin-pending-list-update-issues&quot;&gt;Strategies for dealing with GIN pending list update issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#gin-index-support-in-the-pganalyze-index-advisor&quot;&gt;GIN index support in the pganalyze Index Advisor&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#other-helpful-resources&quot;&gt;Other helpful resources&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;gin-index-in-postgres-what-is-it-actually&quot; &gt;&lt;a href=&quot;#gin-index-in-postgres-what-is-it-actually&quot; aria-label=&quot;gin index in postgres what is it actually permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;GIN Index in Postgres: What is it actually?&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;“The GIN index type was designed to &lt;strong&gt;deal with data types that are subdividable and you want to search for individual component values&lt;/strong&gt; (array elements, lexemes in a text document, etc)” - &lt;a href=&quot;https://www.postgresql.org/message-id/flat/26038.1559516834%40sss.pgh.pa.us#ccb004aefc151d913e7a274a9b30c631&quot;&gt;Tom Lane&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The GIN index type was initially created by Teodor Sigaev and Oleg Bartunov, first released in Postgres 8.2, on December 5, 2006 - almost 15 years ago. Since then, GIN has seen many improvements, but the fundamental structure remains similar. GIN stands for &quot;Generalized Inverted iNdex&quot;. &quot;Inverted&quot; refers to the way that the index structure is set up, building a table-encompassing tree of all column values, where a single row can be represented in &lt;strong&gt;many places&lt;/strong&gt; within the tree. By comparison, a B-tree index generally has &lt;strong&gt;one location&lt;/strong&gt; where an index entry points to a specific row.&lt;/p&gt;
&lt;p&gt;Another way of explaining GIN indexes comes from a &lt;a href=&quot;https://wiki.postgresql.org/images/2/25/Full-text_search_in_PostgreSQL_in_milliseconds-extended-version.pdf&quot;&gt;presentation by Oleg Bartunov and Alexander Korotkov&lt;/a&gt; at PGConf.EU 2012 in Prague. They describe a GIN index like the table of contents in a book, where the heap pointers (to the actual table) are the page numbers. Multiple entries can be combined to yield a specific result, like the search for “compensation accelerometers” in this example:&lt;/p&gt;
&lt;p &gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Example of how GIN is structured like a book&apos;s table of contents&quot; title=&quot;Example of how GIN is structured like a book&apos;s table of contents&quot; src=&quot;https://pganalyze.com/static/888d381b466ef22724d3053f47c7a4f1/1d69c/gin_table_of_contents.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;It’s important to note that the exact mapping of a column of a given data type is dependent on the GIN index operator class. That means, instead of having a uniform representation of data in the index, like with B-trees, a GIN index can have very different index contents depending on which data type and operator class you are using. Some data types, such as JSONB, have more than one GIN operator class, so you can pick the index structure that best fits specific query patterns.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Before we move on, one more thing to know:&lt;/strong&gt; GIN indexes only support Bitmap Index Scans (not Index Scan or Index Only Scan), due to the fact that they only store parts of the row values in each index page. Don’t be surprised when EXPLAIN always shows Bitmap Index / Heap Scans for your GIN indexes.&lt;/p&gt;
&lt;p&gt;Let’s take a look at a few examples:&lt;/p&gt;
&lt;h3 id=&quot;indexing-tsvector-columns-for-postgres-full-text-search&quot; &gt;&lt;a href=&quot;#indexing-tsvector-columns-for-postgres-full-text-search&quot; aria-label=&quot;indexing tsvector columns for postgres full text search permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Indexing tsvector columns for Postgres full text search&lt;/h3&gt;
&lt;p&gt;The initial motivation for GIN indexes was full text search. Before GIN was added, there was no way to index full text search in Postgres, instead requiring a very slow sequential scan of the table.&lt;/p&gt;
&lt;p&gt;We’ve previously written about &lt;a href=&quot;https://pganalyze.com/blog/full-text-search-django-postgres&quot;&gt;Postgres full text search with Django&lt;/a&gt;, as well as how to do it with &lt;a href=&quot;https://pganalyze.com/blog/full-text-search-ruby-rails-postgres&quot;&gt;Ruby on Rails&lt;/a&gt; on the pganalyze blog.&lt;/p&gt;
&lt;p&gt;A simple example for a full text search index looks like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; pgweb_idx &lt;span &gt;ON&lt;/span&gt; pgweb &lt;span &gt;USING&lt;/span&gt; GIN &lt;span &gt;(&lt;/span&gt;to_tsvector&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;english&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; body&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This uses an expression index to create a GIN index that contains the indexed tsvector values for each row. You can then query like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; title
&lt;span &gt;FROM&lt;/span&gt; pgweb
&lt;span &gt;WHERE&lt;/span&gt; to_tsvector&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;english&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; body&lt;span &gt;)&lt;/span&gt; @@ to_tsquery&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;english&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&apos;friend&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As described in the &lt;a href=&quot;https://www.postgresql.org/docs/current/textsearch-indexes.html&quot;&gt;Postgres documentation&lt;/a&gt;, the tsvector GIN index structure is focused on lexemes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“GIN indexes are the preferred text search index type. As inverted indexes, they contain an index entry for each word (lexeme), with a compressed list of matching locations. Multi-word searches can find the first match, then use the index to remove rows that are lacking additional words.”&lt;/p&gt;
&lt;/blockquote&gt;
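&lt;p&gt;As a minimal sketch of such a multi-word search against the &lt;code &gt;pgweb&lt;/code&gt; example above, you could use &lt;code &gt;plainto_tsquery&lt;/code&gt;, which combines the normalized words with AND. The search terms here are only illustrative:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Both lexemes must match; the expression is the same one used in pgweb_idx,
-- so the planner can consider the GIN index for it.
SELECT title
FROM pgweb
WHERE to_tsvector(&apos;english&apos;, body) @@ plainto_tsquery(&apos;english&apos;, &apos;friend family&apos;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;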
&lt;p&gt;GIN indexes are the best starting point when using Postgres Full Text Search. There are situations where a GiST index might be preferred (see the &lt;a href=&quot;https://www.postgresql.org/docs/14/textsearch-indexes.html&quot;&gt;Postgres documentation&lt;/a&gt; for details), and if you run your own server you could also consider the newer &lt;a href=&quot;https://github.com/postgrespro/rum&quot;&gt;RUM index types&lt;/a&gt; available through an extension.&lt;/p&gt;
&lt;p&gt;Let&apos;s see what else GIN has to offer:&lt;/p&gt;
&lt;h3 id=&quot;indexing-like-searches-with-trigrams-and-gin_trgm_ops&quot; &gt;&lt;a href=&quot;#indexing-like-searches-with-trigrams-and-gin_trgm_ops&quot; aria-label=&quot;indexing like searches with trigrams and gin_trgm_ops permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Indexing LIKE searches with Trigrams and gin_trgm_ops&lt;/h3&gt;
&lt;p&gt;Sometimes Full Text Search isn&apos;t the right fit, but you find yourself needing to index a LIKE search on a particular column:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; test_trgm &lt;span &gt;(&lt;/span&gt;t &lt;span &gt;text&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test_trgm &lt;span &gt;WHERE&lt;/span&gt; t &lt;span &gt;LIKE&lt;/span&gt; &lt;span &gt;&apos;%foo%bar&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Due to the nature of the LIKE operation, which supports arbitrary wildcard expressions, this is fundamentally hard to index. However, the &lt;code &gt;pg_trgm&lt;/code&gt; extension can help. When you create an index like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; EXTENSION pg_trgm&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; trgm_idx &lt;span &gt;ON&lt;/span&gt; test_trgm &lt;span &gt;USING&lt;/span&gt; gin &lt;span &gt;(&lt;/span&gt;t gin_trgm_ops&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Postgres will split the row values into trigrams, allowing indexed searches:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test_trgm &lt;span &gt;WHERE&lt;/span&gt; t &lt;span &gt;LIKE&lt;/span&gt; &lt;span &gt;&apos;%foo%bar&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                               QUERY PLAN                               
------------------------------------------------------------------------
 Bitmap Heap Scan on test_trgm  (cost=16.00..20.02 rows=1 width=32)
   Recheck Cond: (t ~~ &apos;%foo%bar&apos;::text)
   -&gt;  Bitmap Index Scan on trgm_idx  (cost=0.00..16.00 rows=1 width=0)
         Index Cond: (t ~~ &apos;%foo%bar&apos;::text)
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Effectiveness of this method varies with the exact data set. But when it works, it can speed up searches on arbitrary text data quite significantly.&lt;/p&gt;
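&lt;p&gt;If you are curious which trigrams end up being indexed for a given value, the &lt;code &gt;show_trgm&lt;/code&gt; function from &lt;code &gt;pg_trgm&lt;/code&gt; can help. This is a small sketch that assumes the extension is installed; the input string is just an illustration:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Returns the array of trigrams that pg_trgm extracts from the string
SELECT show_trgm(&apos;foobar&apos;);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;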
&lt;p&gt;&lt;a src=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Download Free eBook: Effective Indexing in Postgres&quot;
        title=&quot;Download Free eBook: Effective Indexing in Postgres&quot;
        src=&quot;https://pganalyze.com/static/97b01777597bdcba8b1803935f1b7da0/acb04/ebook_promo_postgres_create_index.jpg&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;postgresql-jsonb-and-gin-indexes&quot; &gt;&lt;a href=&quot;#postgresql-jsonb-and-gin-indexes&quot; aria-label=&quot;postgresql jsonb and gin indexes permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;PostgreSQL, JSONB and GIN Indexes&lt;/h2&gt;
&lt;p&gt;JSONB was added to Postgres in version 9.4, about eight years after GIN indexes were introduced. It speaks to the flexibility of the GIN index type that it is the preferred way to index JSONB columns.&lt;/p&gt;
&lt;h3 id=&quot;postgres-gin-index-for-jsonb-columns-using-jsonb_ops-and-jsonb_path_ops&quot; &gt;&lt;a href=&quot;#postgres-gin-index-for-jsonb-columns-using-jsonb_ops-and-jsonb_path_ops&quot; aria-label=&quot;postgres gin index for jsonb columns using jsonb_ops and jsonb_path_ops permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Postgres GIN index for JSONB columns using jsonb_ops and jsonb_path_ops&lt;/h3&gt;
&lt;p&gt;With JSONB in Postgres we gain the flexibility of not having to define our schema upfront, but instead we can dynamically add data to a column in our table in JSON format.&lt;/p&gt;
&lt;p&gt;The most basic GIN index example for JSONB looks like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; test &lt;span &gt;(&lt;/span&gt;
  id bigserial &lt;span &gt;PRIMARY&lt;/span&gt; &lt;span &gt;KEY&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  &lt;span &gt;data&lt;/span&gt; jsonb
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; test&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;VALUES&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;{&quot;field&quot;: &quot;value1&quot;}&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; test&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;VALUES&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;{&quot;field&quot;: &quot;value2&quot;}&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;INSERT&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; test&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;VALUES&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;{&quot;other_field&quot;: &quot;value42&quot;}&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; test &lt;span &gt;USING&lt;/span&gt; gin&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see with EXPLAIN, this is able to use the index, for example when querying for all rows that have the field key defined:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;data&lt;/span&gt; ? &lt;span &gt;&apos;field&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=8.00..12.01 rows=1 width=40)
   Recheck Cond: (data ? &apos;field&apos;::text)
   -&gt;  Bitmap Index Scan on test_data_idx  (cost=0.00..8.00 rows=1 width=0)
         Index Cond: (data ? &apos;field&apos;::text)
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The way this gets stored is based on the keys and values of the JSONB data. In the above test data, the default &lt;code &gt;jsonb_ops&lt;/code&gt; operator class would store the following values in the GIN index, as separate entries: &lt;code &gt;field&lt;/code&gt;, &lt;code &gt;other_field&lt;/code&gt;, &lt;code &gt;value1&lt;/code&gt;, &lt;code &gt;value2&lt;/code&gt;, &lt;code &gt;value42&lt;/code&gt;. Depending on the search the GIN index will combine multiple index entries to satisfy the specific query conditions.&lt;/p&gt;
&lt;p&gt;Now, we can also use the non-default &lt;code &gt;jsonb_path_ops&lt;/code&gt; operator class with a JSONB GIN index. This uses an optimized GIN index structure that would instead store the above data as three individual entries using a hash function: &lt;code &gt;hashfn(field, value1)&lt;/code&gt;, &lt;code &gt;hashfn(field, value2)&lt;/code&gt; and &lt;code &gt;hashfn(other_field, value42)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code &gt;jsonb_path_ops&lt;/code&gt; class is intended to efficiently support containment queries. First we specify the operator class during index creation:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; test &lt;span &gt;USING&lt;/span&gt; gin&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt; jsonb_path_ops&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And then we can use it for queries such as the following:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;data&lt;/span&gt; @&lt;span &gt;&gt;&lt;/span&gt; &lt;span &gt;&apos;{&quot;field&quot;: &quot;value1&quot;}&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=8.00..12.01 rows=1 width=40)
   Recheck Cond: (data @&gt; &apos;{&quot;field&quot;: &quot;value1&quot;}&apos;::jsonb)
   -&gt;  Bitmap Index Scan on test_data_idx1  (cost=0.00..8.00 rows=1 width=0)
         Index Cond: (data @&gt; &apos;{&quot;field&quot;: &quot;value1&quot;}&apos;::jsonb)
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
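&lt;p&gt;One trade-off to keep in mind: because &lt;code &gt;jsonb_path_ops&lt;/code&gt; only stores hashes of paths and values, it does not support the key-exists operator. A rough comparison, using the &lt;code &gt;test&lt;/code&gt; table from above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Containment: can be served by both jsonb_ops and jsonb_path_ops indexes
SELECT * FROM test WHERE data @&gt; &apos;{&quot;field&quot;: &quot;value1&quot;}&apos;;

-- Key existence: needs the default jsonb_ops index,
-- a jsonb_path_ops index cannot be used for this operator
SELECT * FROM test WHERE data ? &apos;field&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;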
&lt;p&gt;As you can see, it’s easy to index a JSONB column. Note that you could technically also index JSONB with other index types by taking specific parts of the data. For example, we could use a B-tree expression index to index the values stored under the &lt;code &gt;field&lt;/code&gt; key:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; test &lt;span &gt;USING&lt;/span&gt; &lt;span &gt;btree&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt; &lt;span &gt;-&lt;/span&gt;&lt;span &gt;&gt;&gt;&lt;/span&gt; &lt;span &gt;&apos;field&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The Postgres query planner will then use the specific expression index behind the scenes, if your query matches the expression:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;data&lt;/span&gt;&lt;span &gt;-&lt;/span&gt;&lt;span &gt;&gt;&gt;&lt;/span&gt;&lt;span &gt;&apos;field&apos;&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;value1&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                 QUERY PLAN                                 
---------------------------------------------------------------------------
 Index Scan using test_expr_idx on test  (cost=0.13..8.15 rows=1 width=40)
   Index Cond: ((data -&gt;&gt; &apos;field&apos;::text) = &apos;value1&apos;::text)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There is one more thing we should look at with finding the right GIN index, and that is multi-column GIN indexes.&lt;/p&gt;
&lt;h2 id=&quot;multi-column-gin-indexes-and-combining-gin-and-b-tree-indexes&quot; &gt;&lt;a href=&quot;#multi-column-gin-indexes-and-combining-gin-and-b-tree-indexes&quot; aria-label=&quot;multi column gin indexes and combining gin and b tree indexes permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Multi-Column GIN Indexes, and Combining GIN and B-tree indexes&lt;/h2&gt;
&lt;p&gt;Often you’ll have queries that filter on a column that uses a data type that’s ideal for GIN indexes, such as JSONB, but that also filter on another column that is more of a typical B-tree index candidate:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; records &lt;span &gt;(&lt;/span&gt;
  id bigserial &lt;span &gt;PRIMARY&lt;/span&gt; &lt;span &gt;KEY&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  customer_id int4&lt;span &gt;,&lt;/span&gt;
  &lt;span &gt;data&lt;/span&gt; jsonb
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; records &lt;span &gt;WHERE&lt;/span&gt; customer_id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;123&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;data&lt;/span&gt; @&lt;span &gt;&gt;&lt;/span&gt; &lt;span &gt;&apos;{ &quot;location&quot;: &quot;New York&quot; }&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In addition you might have a query like the following:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; records &lt;span &gt;WHERE&lt;/span&gt; customer_id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;123&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And you are considering which index to create for the two queries combined.&lt;/p&gt;
&lt;p&gt;There are two fundamental strategies you can take:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;(1) Create two separate indexes, one on &lt;code &gt;customer_id&lt;/code&gt; using a B-tree, and one on &lt;code &gt;data&lt;/code&gt; using GIN
&lt;ul&gt;
&lt;li&gt;In this situation, for the first query, Postgres might use BitmapAnd to combine the index search results from both indexes to find the affected rows&lt;/li&gt;
&lt;li&gt;Whilst the idea of using two separate indexes sounds great in theory, in practice it often turns out to be the worse performing option. You can find some discussions about this on the &lt;a href=&quot;https://www.postgresql.org/message-id/flat/56B332B6.1040109%40promani.be&quot;&gt;Postgres mailing lists&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;(2) Create one multi-column GIN index on both &lt;code &gt;customer_id&lt;/code&gt; and &lt;code &gt;data&lt;/code&gt;
&lt;ul&gt;
&lt;li&gt;Note that multi-column GIN indexes don’t help much with making the index more effective, but they can help cover multiple queries with the same index&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
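&lt;p&gt;For reference, the first strategy could look like the following sketch on the &lt;code &gt;records&lt;/code&gt; table. Whether Postgres actually combines the two indexes with a BitmapAnd depends on the planner&apos;s cost estimates for your data:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Strategy (1): two separate single-column indexes
CREATE INDEX ON records USING btree (customer_id);
CREATE INDEX ON records USING gin (data);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;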
&lt;p&gt;For implementing the second strategy, we need the help of the “btree_gin” extension in Postgres (part of contrib) that contains operator classes for data types that are not subdividable.&lt;/p&gt;
&lt;p&gt;You can create the extension and the multi-column index like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; EXTENSION btree_gin&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; records &lt;span &gt;USING&lt;/span&gt; gin &lt;span &gt;(&lt;/span&gt;&lt;span &gt;data&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; customer_id&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that index column order does not matter for GIN indexes. And as we can see, this gets used during query planning:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; records &lt;span &gt;WHERE&lt;/span&gt; customer_id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;123&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;data&lt;/span&gt; @&lt;span &gt;&gt;&lt;/span&gt; &lt;span &gt;&apos;{ &quot;location&quot;: &quot;New York&quot; }&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Bitmap Heap Scan on records  (cost=16.01..20.03 rows=1 width=41)
   Recheck Cond: ((customer_id = 123) AND (data @&gt; &apos;{&quot;location&quot;: &quot;New York&quot;}&apos;::jsonb))
   -&gt;  Bitmap Index Scan on records_data_customer_id_idx  (cost=0.00..16.01 rows=1 width=0)
         Index Cond: ((customer_id = 123) AND (data @&gt; &apos;{&quot;location&quot;: &quot;New York&quot;}&apos;::jsonb))
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It’s rather uncommon to use multi-column GIN indexes, but depending on your workload it might make sense. Remember that larger indexes mean more I/O, making index lookups slower, and writes more expensive.&lt;/p&gt;
&lt;h2 id=&quot;the-downside-of-gin-indexes-expensive-updates&quot; &gt;&lt;a href=&quot;#the-downside-of-gin-indexes-expensive-updates&quot; aria-label=&quot;the downside of gin indexes expensive updates permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The downside of GIN Indexes: Expensive Updates&lt;/h2&gt;
&lt;p&gt;As you saw in the examples above, GIN indexes are special because they often contain multiple index entries per single row that is being inserted. This is essential to enable the use cases that GIN supports, but causes one significant problem: Updating the index is expensive.&lt;/p&gt;
&lt;p&gt;Because a single row can cause tens or, in the worst case, hundreds of index entries to be updated, it’s important to understand the special &lt;code &gt;fastupdate&lt;/code&gt; mechanism of GIN indexes.&lt;/p&gt;
&lt;p&gt;By default &lt;code &gt;fastupdate&lt;/code&gt; is enabled for GIN indexes, and it causes index updates to be deferred, so they can occur at a point where multiple updates have to be made, reducing the overhead for a single UPDATE, at the expense of having to do the work at a later point.&lt;/p&gt;
&lt;p&gt;The data that is deferred is kept in the special &lt;strong&gt;pending list&lt;/strong&gt;, which then gets flushed to the main index structure in one of three situations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;code &gt;gin_pending_list_limit&lt;/code&gt; (default of 4MB) is reached during a regular index update&lt;/li&gt;
&lt;li&gt;Explicit call to the &lt;code &gt;gin_clean_pending_list&lt;/code&gt; function&lt;/li&gt;
&lt;li&gt;Autovacuum on the table with the GIN index (GIN pending list cleanup happens at the end of vacuum)&lt;/li&gt;
&lt;/ol&gt;
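&lt;p&gt;The knobs involved here can be set per index. The following is a sketch only, reusing the &lt;code &gt;trgm_idx&lt;/code&gt; index from the earlier example; the limit value is an arbitrary illustration, not a recommendation:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Flush the pending list explicitly, e.g. from a scheduled job
SELECT gin_clean_pending_list(&apos;trgm_idx&apos;::regclass);

-- Override the pending list limit for this index only (value in kilobytes)
ALTER INDEX trgm_idx SET (gin_pending_list_limit = 1024);

-- Or turn off deferred updates for this index entirely; entries already
-- in the pending list stay there until the next cleanup or vacuum
ALTER INDEX trgm_idx SET (fastupdate = off);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;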
&lt;p&gt;As you can imagine this can be quite an expensive operation, which is why one symptom of index write overhead with GIN can be that every Nth INSERT or UPDATE statement suddenly is a lot slower, in case you run into the first scenario above, where the &lt;code &gt;gin_pending_list_limit&lt;/code&gt; is reached.&lt;/p&gt;
&lt;p&gt;This exact situation happened to the team at GitLab recently. Let’s look at a real life example of where GIN updates became a problem.&lt;/p&gt;
&lt;h3 id=&quot;gin-trigram-indexes-a-lesson-from-gitlab&quot; &gt;&lt;a href=&quot;#gin-trigram-indexes-a-lesson-from-gitlab&quot; aria-label=&quot;gin trigram indexes a lesson from gitlab permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;GIN trigram indexes: A lesson from GitLab&lt;/h3&gt;
&lt;p&gt;The team at GitLab often publishes their discussions of database optimizations publicly, and we can learn a lot from these interactions. &lt;a href=&quot;https://gitlab.com/gitlab-org/gitlab/-/issues/336930&quot;&gt;A recent example discussed&lt;/a&gt; a GIN trigram index that caused merge requests to be quite slow occasionally:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“We can see there are a number of slow updates for updating a merge request. The interesting thing here is that we see very little locking statements (locking is logged after 5 seconds waiting), which suggests something else is occurring to make these slow.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was determined to be caused by the GIN pending list:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Anecdotally, cleaning the gin index pending-list for the description field on the merge_requests table can cost multiple seconds.  The overhead does increase when there are more pending entries to write to the index.  In this informal survey of manually running gin_clean_pending_list( &apos;index_merge_requests_on_description_trigram&apos;::regclass ) the duration varied between 465 ms and 3155 ms.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The team further investigated, and determined that the GIN pending list was flushed a very high number of times during business hours:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“this gin index&apos;s pending list fills up roughly once every 2.7 seconds during the peak hours of a normal weekday.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you want to read the full story, GitLab’s Matt Smiley has done an &lt;a href=&quot;https://gitlab.com/gitlab-com/gl-infra/production/-/issues/4725#note_596146675&quot;&gt;excellent analysis of the problem they’ve encountered&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As we can see, getting good data about the actual overhead of GIN pending list updates is critical.&lt;/p&gt;
&lt;p&gt;
&lt;a src=&quot;https://pganalyze.com/index-advisor&quot;&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor promotion banner&quot; title=&quot;pganalyze Index Advisor promotion banner&quot; src=&quot;https://pganalyze.com/static/7dad04148f9e0117c49a306ff9ab40b1/acb04/promo_index_advisor.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h3 id=&quot;measuring-gin-pending-list-overhead-and-size&quot; &gt;&lt;a href=&quot;#measuring-gin-pending-list-overhead-and-size&quot; aria-label=&quot;measuring gin pending list overhead and size permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Measuring GIN pending list overhead and size&lt;/h3&gt;
&lt;p&gt;To validate whether the GIN pending list is a problem on a busy table, we can do a few things:&lt;/p&gt;
&lt;p&gt;First, we could utilize the &lt;code &gt;pgstatginindex&lt;/code&gt; function together with something like psql’s \watch command to keep a close eye on a particular index:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; EXTENSION pgstattuple&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pgstatginindex&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;myindex&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; version | pending_pages | pending_tuples 
---------+---------------+----------------
       2 |             0 |              0
(1 row)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
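&lt;p&gt;As a minimal sketch of the polling approach, the query below re-runs every five seconds via &lt;code &gt;\watch&lt;/code&gt; and also converts the pending page count into a size. The index name &lt;code &gt;myindex&lt;/code&gt; is the placeholder from the example above, and the five second interval is an arbitrary choice:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Run inside psql; \watch takes the place of the terminating semicolon
SELECT pending_pages,
       pending_tuples,
       pg_size_pretty(pending_pages * current_setting(&apos;block_size&apos;)::bigint) AS pending_size
  FROM pgstatginindex(&apos;myindex&apos;)
\watch 5&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the pending size repeatedly climbs towards &lt;code &gt;gin_pending_list_limit&lt;/code&gt; and then drops back to zero, that suggests flushes are happening as part of the regular write workload.&lt;/p&gt;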
&lt;p&gt;Second, if you run your own database server, you can use “perf” &lt;a href=&quot;https://wiki.postgresql.org/wiki/Profiling_with_perf#Dynamic_tracepoints&quot;&gt;dynamic tracepoints&lt;/a&gt; to measure calls to the &lt;code &gt;ginInsertCleanup&lt;/code&gt; function in Postgres:&lt;/p&gt;
&lt;div  data-language=&quot;sh&quot;&gt;&lt;pre &gt;&lt;code &gt;sudo perf probe -x /usr/lib/postgresql/14/bin/postgres ginInsertCleanup
sudo perf stat -a -e probe_postgres:ginInsertCleanup -- sleep 60&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;An alternate method, using DTrace, was described in a &lt;a href=&quot;https://www.youtube.com/watch?v=Brt41xnMZqo&amp;#x26;t=1949s&quot;&gt;2019 PGCon talk&lt;/a&gt;. The authors of that talk also ended up visualizing different &lt;code &gt;gin_pending_list_limit&lt;/code&gt; and &lt;code &gt;work_mem&lt;/code&gt; settings:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;DTrace measurements of GIN pending list flushes&quot; title=&quot;DTrace measurements of GIN pending list flushes&quot; src=&quot;https://pganalyze.com/static/5389af77457315017a70d95953877cd4/1d69c/gin_dtrace.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;As they discovered, the memory limit during flushing of the pending list makes quite a noticeable difference.&lt;/p&gt;
&lt;p&gt;If you don&apos;t have the luxury of direct access to your database server, you can &lt;a href=&quot;https://gitlab.com/gitlab-com/gl-infra/production/-/issues/4725#note_596146675&quot;&gt;estimate how often the pending list&lt;/a&gt; fills up based on the average size of index tuples and other statistics.&lt;/p&gt;
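&lt;p&gt;A rough back-of-the-envelope version of such an estimate could look like the sketch below. It is not the exact method from the linked analysis: both input values are hypothetical placeholders that you would replace with your own measurements, and it reads the server-wide &lt;code &gt;gin_pending_list_limit&lt;/code&gt;, ignoring any per-index override:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical inputs: replace with values measured on your own system
WITH inputs AS (
  SELECT 500::numeric AS rows_written_per_sec,   -- assumed write rate on the table
         300::numeric AS pending_bytes_per_row   -- assumed pending list growth per written row
), lim AS (
  -- pg_settings reports gin_pending_list_limit in kilobytes
  SELECT setting::numeric * 1024 AS limit_bytes
    FROM pg_settings
   WHERE name = &apos;gin_pending_list_limit&apos;
)
SELECT round(limit_bytes / (rows_written_per_sec * pending_bytes_per_row), 1) AS seconds_between_flushes
  FROM inputs, lim;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With the 4MB default and the placeholder numbers above, this works out to a flush roughly every 28 seconds.&lt;/p&gt;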
&lt;p&gt;Now, if we determine that we have a problem, what can we do about it?&lt;/p&gt;
&lt;h3 id=&quot;strategies-for-dealing-with-gin-pending-list-update-issues&quot; &gt;&lt;a href=&quot;#strategies-for-dealing-with-gin-pending-list-update-issues&quot; aria-label=&quot;strategies for dealing with gin pending list update issues permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Strategies for dealing with GIN pending list update issues&lt;/h3&gt;
&lt;p&gt;There are multiple alternate ways you can resolve issues like the one GitLab encountered:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;(1) Reduce &lt;code &gt;gin_pending_list_limit&lt;/code&gt;
&lt;ul&gt;
&lt;li&gt;Have more frequent, smaller flushes&lt;/li&gt;
&lt;li&gt;This may sound odd - but &lt;code &gt;gin_pending_list_limit&lt;/code&gt; was originally determined by &lt;code &gt;work_mem&lt;/code&gt; (instead of being its own setting), and has only been configurable separately since Postgres 9.5. That history explains the 4MB default, which may be too high in some cases&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;(2) Increase &lt;code &gt;gin_pending_list_limit&lt;/code&gt;
&lt;ul&gt;
&lt;li&gt;Have more opportunities to clean up the list outside of the regular workload&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;(3) Turn off &lt;code &gt;fastupdate&lt;/code&gt;
&lt;ul&gt;
&lt;li&gt;Take the overhead with each individual INSERT/UPDATE&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;(4) Tune autovacuum to run more often on the table, in order to clean the pending list&lt;/li&gt;
&lt;li&gt;(5) Explicitly call &lt;code &gt;gin_clean_pending_list()&lt;/code&gt;, instead of relying on autovacuum&lt;/li&gt;
&lt;li&gt;(6) Drop the GIN index
&lt;ul&gt;
&lt;li&gt;If you have alternate ways of indexing the data, for example using expression indexes&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Depending on your workload one or multiple of these approaches could be a good fit.&lt;/p&gt;
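&lt;p&gt;As a sketch of what options (1) through (5) can look like in practice, the statements below use the placeholder names &lt;code &gt;myindex&lt;/code&gt; and &lt;code &gt;mytable&lt;/code&gt;. The numeric values are illustrative assumptions, not recommendations:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- (1) / (2) Override the pending list limit for a single index (value in kilobytes)
ALTER INDEX myindex SET (gin_pending_list_limit = 1024);

-- (3) Turn off fastupdate, then flush the entries that are already pending
ALTER INDEX myindex SET (fastupdate = off);
SELECT gin_clean_pending_list(&apos;myindex&apos;::regclass);

-- (4) Let autovacuum and autoanalyze trigger earlier on this table
ALTER TABLE mytable SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_analyze_scale_factor = 0.01
);

-- (5) Flush the pending list explicitly, for example from a scheduled job
SELECT gin_clean_pending_list(&apos;myindex&apos;::regclass);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that turning off &lt;code &gt;fastupdate&lt;/code&gt; only affects future insertions, which is why the sketch flushes the existing pending list right afterwards.&lt;/p&gt;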
&lt;p&gt;In addition, it’s important to ensure you have sufficient memory available during the GIN pending list cleanup. The memory limit used for the pending list flush can be confusing, because it is not derived from &lt;code &gt;gin_pending_list_limit&lt;/code&gt;. Instead, it uses the following Postgres settings:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code &gt;work_mem&lt;/code&gt; during regular INSERT/UPDATE&lt;/li&gt;
&lt;li&gt;&lt;code &gt;maintenance_work_mem&lt;/code&gt; during &lt;code &gt;gin_clean_pending_list()&lt;/code&gt; call&lt;/li&gt;
&lt;li&gt;&lt;code &gt;autovacuum_work_mem&lt;/code&gt; during autovacuum&lt;/li&gt;
&lt;/ul&gt;
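&lt;p&gt;A minimal sketch for checking these settings, and for raising the limit for a manual flush in the current session only, could look like this. The index name &lt;code &gt;myindex&lt;/code&gt; and the 256MB value are placeholder assumptions:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Check the limits that apply to each flush path
SHOW work_mem;
SHOW maintenance_work_mem;
SHOW autovacuum_work_mem;  -- -1 means maintenance_work_mem is used instead

-- Raise the limit for this session only, then flush manually
SET maintenance_work_mem = &apos;256MB&apos;;
SELECT gin_clean_pending_list(&apos;myindex&apos;::regclass);
RESET maintenance_work_mem;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;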
&lt;p&gt;Last but not least, you may want to consider partitioning or sharding a table that encounters problems like this. It may not be the easiest thing to do, but scaling GIN indexes to heavy write workloads is quite a tricky business.&lt;/p&gt;
&lt;h2 id=&quot;gin-index-support-in-the-pganalyze-index-advisor&quot; &gt;&lt;a href=&quot;#gin-index-support-in-the-pganalyze-index-advisor&quot; aria-label=&quot;gin index support in the pganalyze index advisor permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;GIN index support in the pganalyze Index Advisor&lt;/h2&gt;
&lt;p&gt;Not sure if your workload could utilize a GIN index, or which index to create for your queries?&lt;/p&gt;
&lt;p&gt;We have now added initial support for GIN and GiST index recommendations to the &lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;pganalyze Index Advisor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here is an example of a GIN index recommendation for an existing &lt;code &gt;tsvector&lt;/code&gt; column:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor example with GIN index recommendation&quot; title=&quot;pganalyze Index Advisor example with GIN index recommendation&quot; src=&quot;https://pganalyze.com/static/438113e00dfe6d2bf03fbf617b46b853/1d69c/index_advisor.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Note that the costing and size estimation logic for GIN and GiST indexes is still being actively developed.&lt;/p&gt;
&lt;p&gt;We recommend trying out the Index Advisor recommendation on your own system to assess its effectiveness, as well as monitoring the production table for write overhead after you have added an index. You may also need to tweak your queries to make use of a particular index.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;GIN indexes are powerful, and often the only way to index certain queries and data types. But with great power comes great responsibility. Use GIN indexes wisely, especially on tables that are heavily written to.&lt;/p&gt;
&lt;p&gt;And when you are not sure which GIN index could work, try out the &lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;pganalyze Index Advisor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you want to share this article with your peers, feel free to &lt;a href=&quot;https://twitter.com/intent/tweet?text=%E2%80%9DUnderstanding%20Postgres%20GIN%20Indexes%3A%20The%20Good%20and%20the%20Bad%22%20-%20In%20this%20article,%20%40pganalyze%20shows%20how%20to%20index%20JSONB,%20text%20search%20and%20more%20with%20GIN,%20and%20why%20index%20updates%20can%20get%20expensive%3A%20https://pganalyze.com/blog/gin-index&quot;&gt;tweet it&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;other-helpful-resources&quot; &gt;&lt;a href=&quot;#other-helpful-resources&quot; aria-label=&quot;other helpful resources permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Other helpful resources&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/postgres-create-index&quot;&gt;Using Postgres CREATE INDEX: Understanding operator classes, index types &amp;#x26; more&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/deconstructing-the-postgres-planner&quot;&gt;How we deconstructed the Postgres planner to find indexing opportunities&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/efficient-search-in-rails-with-postgres&quot;&gt;Efficient Search in Rails with Postgres (PDF eBook)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/full-text-search-django-postgres&quot;&gt;Efficient Postgres Full Text Search in Django&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/full-text-search-ruby-rails-postgres&quot;&gt;Full Text Search in Milliseconds with Rails and PostgreSQL&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/pagination-django-postgres&quot;&gt;Efficient Pagination in Django and Postgres&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;eBook: Effective Indexing in Postgres&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/webinars/how-to-reason-about-indexing-your-postgres-database&quot;&gt;Webinar: How To Reason About Indexing Your Postgres Database&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/blog/5mins-postgres-for-app-developers-tables-indexes&quot;&gt;5mins of Postgres E17: Demystifying Postgres for application developers: A mental model for tables and indexes&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;pganalyze Index Advisor for Postgres&lt;/a&gt;&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[How we deconstructed the Postgres planner to find indexing opportunities]]></title><description><![CDATA[Everyone who has used Postgres has directly or indirectly used the Postgres planner. The Postgres planner is central to determining how a query gets executed, whether indexes get used, how tables are joined, and more. When Postgres asks itself "How do we run this query?”, the planner answers. And just like Postgres has evolved over decades, the planner has not stood still either. It can sometimes be challenging to understand what exactly the Postgres planner does, and which data it bases its…]]></description><link>https://pganalyze.com/blog/deconstructing-the-postgres-planner</link><guid isPermaLink="false">https://pganalyze.com/blog/deconstructing-the-postgres-planner</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Tue, 02 Nov 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Everyone who has used Postgres has directly or indirectly used the Postgres planner. The Postgres planner is central to determining how a query gets executed, whether indexes get used, how tables are joined, and more. When Postgres asks itself &lt;em&gt;&quot;How do we run this query?”&lt;/em&gt;, the planner answers.&lt;/p&gt;
&lt;p&gt;And just like Postgres has evolved over decades, the planner has not stood still either. &lt;strong&gt;It can sometimes be challenging to understand what exactly the Postgres planner does&lt;/strong&gt;, and which data it bases its decisions on.&lt;/p&gt;
&lt;p&gt;Earlier this year we set out to gain a deep understanding of the planner to improve indexing tools for Postgres. Based on this work we launched the first iteration of the &lt;a href=&quot;https://pganalyze.com/blog/introducing-pganalyze-index-advisor&quot;&gt;pganalyze Index Advisor&lt;/a&gt; over a month ago, and have received an incredible amount of feedback and overall response.&lt;/p&gt;
&lt;p&gt;In this post we take a closer look at &lt;strong&gt;how we extracted the planner into a standalone library&lt;/strong&gt;, just like we did with &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser&quot;&gt;pg_query&lt;/a&gt;. We then assess how this approach compares to an actual running server, and what is possible now that we can run the planner code. Based on this we look at how we used its decision making know-how to find indexing opportunities, and review the topic of clause selectivity, and how we incorporated feedback by a Postgres community member.&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#planning-a-postgres-query-without-a-running-database-server&quot;&gt;Planning a Postgres query without a running database server&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#how-accurate-is-this-planning-process&quot;&gt;How accurate is this planning process?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#finding-multiple-possible-plan-paths-not-just-the-best-path&quot;&gt;Finding multiple possible plan paths, not just the best path&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#making-index-recommendations-based-on-restriction-clauses&quot;&gt;Making index recommendations based on restriction clauses&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#understanding-postgres-clause-selectivity&quot;&gt;Understanding Postgres clause selectivity&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#how-we-incorporated-postgres-community-feedback&quot;&gt;How we incorporated Postgres community feedback&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#creating-the-best-index-vs-creating-good-enough-indexes&quot;&gt;Creating the best index, vs creating “good enough” indexes&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#join-us-for-design-research-sessions&quot;&gt;Join us for design research sessions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor architecture&quot; title=&quot;pganalyze Index Advisor architecture&quot; src=&quot;https://pganalyze.com/static/0f186601ce07fcb5307920523985884e/1d69c/index_advisor_architecture_short.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;h2 id=&quot;planning-a-postgres-query-without-a-running-database-server&quot; &gt;&lt;a href=&quot;#planning-a-postgres-query-without-a-running-database-server&quot; aria-label=&quot;planning a postgres query without a running database server permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Planning a Postgres query without a running database server&lt;/h2&gt;
&lt;p&gt;At pganalyze we offer performance recommendations for production database systems, without requiring complex installation steps or version upgrades. Whilst Postgres’ extension system is very capable, and we have many ideas on what we could track or do inside Postgres itself, &lt;strong&gt;we intentionally decided not to focus on a Postgres extension for giving index advice&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;There are three top motivations for not creating an extension:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Index decisions often happen during development, where the database that you are working with is not production sized&lt;/li&gt;
&lt;li&gt;Not everyone has direct access to the production database - it’s important we create tooling that can be used by the whole development team&lt;/li&gt;
&lt;li&gt;Adopting a new Postgres extension on a production database is risky, especially if the code is new - and you may not be able to install custom extensions (e.g. on Amazon RDS)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We’ve thus focused on creating something that runs separately from Postgres, but knows how Postgres works. Our approach is inspired by our work on &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser&quot;&gt;pg_query&lt;/a&gt;, and enables planning a query solely based on the query text, the schema definition, and table statistics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;We utilized libclang to automatically extract source code from Postgres&lt;/strong&gt;, &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser#using-libclang-to-extract-c-source-code-from-postgres&quot;&gt;just like we&apos;ve done for pg_query&lt;/a&gt;. Whilst for pg_query we extracted a little bit over 100,000 lines of Postgres source, for the planner we extracted almost 470,000 lines of Postgres source, more than 4x the amount of code. For reference, Postgres itself is almost 1,000,000 lines of source code (as determined by &lt;code &gt;sloccount&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Examples of code from Postgres we didn’t use:&lt;/strong&gt; The executor (except for some initialization routines), the storage subsystem, frontend code, and various specialized code paths.&lt;/p&gt;
&lt;p&gt;A good amount of engineering time later, we ended up with a seemingly simple function in a C library that takes a query and a schema definition, and returns a result similar to an EXPLAIN plan:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * Plan the provided query utilizing the schema definition and the
 * provided table statistics, and return an EXPLAIN-like result.
 */&lt;/span&gt;
PgPlanResult &lt;span &gt;pg_plan&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;const&lt;/span&gt; &lt;span &gt;char&lt;/span&gt;&lt;span &gt;*&lt;/span&gt; query&lt;span &gt;,&lt;/span&gt; &lt;span &gt;const&lt;/span&gt; &lt;span &gt;char&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;schema_and_statistics&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
  …
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;This function is deterministic&lt;/strong&gt;, that is, when you pass the same set of input parameters, you will always get the same output.&lt;/p&gt;
&lt;p&gt;This required some additional modifications to the extracted code (we have about 90 small patches to adjust certain code paths), especially in places where Postgres does rare on-demand checks of file sizes, or looks at the B-tree meta page. Each of these is instead a fixed input parameter, defined using &lt;code &gt;SET&lt;/code&gt; commands in the schema definition.&lt;/p&gt;
&lt;h3 id=&quot;how-accurate-is-this-planning-process&quot; &gt;&lt;a href=&quot;#how-accurate-is-this-planning-process&quot; aria-label=&quot;how accurate is this planning process permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How accurate is this planning process?&lt;/h3&gt;
&lt;p&gt;Let’s take a look at one of our own test queries:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;WITH&lt;/span&gt; unused_indexes &lt;span &gt;AS&lt;/span&gt; MATERIALIZED &lt;span &gt;(&lt;/span&gt;
  &lt;span &gt;SELECT&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;id&lt;span &gt;,&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;name&lt;span &gt;,&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;last_used_at&lt;span &gt;,&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;database_id&lt;span &gt;,&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;table_id 
    &lt;span &gt;FROM&lt;/span&gt; schema_indexes
         &lt;span &gt;JOIN&lt;/span&gt; schema_tables &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;schema_indexes&lt;span &gt;.&lt;/span&gt;table_id &lt;span &gt;=&lt;/span&gt; schema_tables&lt;span &gt;.&lt;/span&gt;id&lt;span &gt;)&lt;/span&gt;
   &lt;span &gt;WHERE&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;database_id &lt;span &gt;IN&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
         &lt;span &gt;AND&lt;/span&gt; schema_tables&lt;span &gt;.&lt;/span&gt;invalidated_at_snapshot_id &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;
         &lt;span &gt;AND&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;invalidated_at_snapshot_id &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;
         &lt;span &gt;AND&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;is_valid
         &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;NOT&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;is_unique &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;0&lt;/span&gt; &lt;span &gt;&amp;lt;&gt;&lt;/span&gt; &lt;span &gt;ALL&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;schema_indexes&lt;span &gt;.&lt;/span&gt;&lt;span &gt;columns&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
         &lt;span &gt;AND&lt;/span&gt; schema_indexes&lt;span &gt;.&lt;/span&gt;last_used_at &lt;span &gt;&amp;lt;&lt;/span&gt; &lt;span &gt;now&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;-&lt;/span&gt; &lt;span &gt;&apos;14 day&apos;&lt;/span&gt;::&lt;span &gt;interval&lt;/span&gt;
&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;id&lt;span &gt;,&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;name&lt;span &gt;,&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;last_used_at&lt;span &gt;,&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;database_id&lt;span &gt;,&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;table_id 
  &lt;span &gt;FROM&lt;/span&gt; unused_indexes ui
 &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;COALESCE&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;
         &lt;span &gt;SELECT&lt;/span&gt; size_bytes
           &lt;span &gt;FROM&lt;/span&gt; schema_index_stats_35d sis
          &lt;span &gt;WHERE&lt;/span&gt; sis&lt;span &gt;.&lt;/span&gt;schema_index_id &lt;span &gt;=&lt;/span&gt; ui&lt;span &gt;.&lt;/span&gt;id
                &lt;span &gt;AND&lt;/span&gt; collected_at &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;2021-10-31 06:40:04&apos;&lt;/span&gt; &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;&gt;&lt;/span&gt; &lt;span &gt;32768&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This query is used inside the pganalyze application to find indexes that were not in use in the last 14 days. Running &lt;code &gt;EXPLAIN (FORMAT JSON)&lt;/code&gt; for the query on our production system, we get a result like this:&lt;/p&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt; &lt;span &gt;[&lt;/span&gt;
   &lt;span &gt;{&lt;/span&gt;
     &lt;span &gt;&quot;Plan&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
       &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;CTE Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
       …
       &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3172.85&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
       &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3311.01&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; 
       &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;11&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
       &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;60&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
       &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(COALESCE((SubPlan 2), &apos;0&apos;::bigint) &gt; 32768)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
       &lt;span &gt;&quot;Plans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
         &lt;span &gt;{&lt;/span&gt;
           &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Nested Loop&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           …
           &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1.12&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3172.85&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;32&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;63&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; 
           &lt;span &gt;&quot;Inner Unique&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;true&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;Plans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
             &lt;span &gt;{&lt;/span&gt;
               &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Parent Relationship&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Outer&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               …
               &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;index_schema_indexes_on_database_id&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               …
               &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.56&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2581.00&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;69&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;63&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(database_id = 1)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(is_valid AND (NOT is_unique) AND (last_used_at &amp;lt; &apos;2021-10-17&apos;::date) AND (0 &amp;lt;&gt; ALL (columns)))&quot;&lt;/span&gt;
             &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
             &lt;span &gt;{&lt;/span&gt;
               &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Parent Relationship&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Inner&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               …
               &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;schema_tables_pkey&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               …
               &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.56&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;8.58&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;8&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(id = schema_indexes.table_id)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
               &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(invalidated_at_snapshot_id IS NULL)&quot;&lt;/span&gt;
             &lt;span &gt;}&lt;/span&gt;
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that we are intentionally running EXPLAIN without ANALYZE, since we care about the cost-based estimation model used by the planner.&lt;/p&gt;
&lt;p&gt;And now, running the same query, with its schema definition and production statistics (but not the actual table data!) provided to the &lt;code &gt;pg_plan&lt;/code&gt; function:&lt;/p&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;[&lt;/span&gt;
  &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;&quot;Plan&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
      &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;CTE Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      …
      &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3181.43&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3324.07&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;11&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;60&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(COALESCE((SubPlan 2), &apos;0&apos;::bigint) &gt; 32768)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;Plans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
        &lt;span &gt;{&lt;/span&gt;
          &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Nested Loop&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          …
          &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1.12&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3181.43&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;33&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;63&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Inner Unique&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;true&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;Plans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
            &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Parent Relationship&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Outer&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; 
              …
              &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;index_schema_indexes_on_database_id&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              …
              &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.56&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2581.00&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;70&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;63&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(database_id = 1)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(is_valid AND (NOT is_unique) AND (last_used_at &amp;lt; &apos;2021-10-17&apos;::date) AND (0 &amp;lt;&gt; ALL (columns)))&quot;&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Parent Relationship&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Inner&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              …
              &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;schema_tables_pkey&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              …
              &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.56&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;8.58&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Plan Rows&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Plan Width&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;8&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(id = schema_indexes.table_id)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(invalidated_at_snapshot_id IS NULL)&quot;&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;
            …&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, for this query &lt;strong&gt;the plan cost estimation is within a 1% margin of the actual production estimates&lt;/strong&gt;. That means we gave the Postgres planner the exact same input parameters as used on the actual database server, and the cost calculation matched almost exactly.&lt;/p&gt;
&lt;p&gt;Now that we’ve established a basis for running the planner and getting cost estimates, let’s look at what we can do with this.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Effective Indexing eBook promotion banner&quot; title=&quot;Effective Indexing eBook promotion banner&quot; src=&quot;https://pganalyze.com/static/b24fdd95dbc38757fe354c86d9ad9aaa/acb04/promo_ebook.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h3 id=&quot;finding-multiple-possible-plan-paths-not-just-the-best-path&quot; &gt;&lt;a href=&quot;#finding-multiple-possible-plan-paths-not-just-the-best-path&quot; aria-label=&quot;finding multiple possible plan paths not just the best path permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Finding multiple possible plan paths, not just the best path&lt;/h3&gt;
&lt;p&gt;When the Postgres planner plans a query, it is under time-sensitive circumstances. That is, any extra work spent searching for a better plan makes planning itself slower. To be fast, the planner quickly throws away plan options it does not consider worth pursuing.&lt;/p&gt;
&lt;p&gt;That unfortunately means we can’t just run EXPLAIN with a flag that says “show me all possible plan variants”. The planner code is simply not written in a way that makes this possible, at least not today.&lt;/p&gt;
&lt;p&gt;However, with our &lt;code &gt;pg_plan&lt;/code&gt; logic running outside the server itself, we do not have these strict speed requirements, and can therefore spend more time looking at alternatives and keeping them around for analysis. For example, here is the internal information we have for a scan node on a table, that illustrates the different paths that could be taken to fulfill the query:&lt;/p&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt;    &lt;span &gt;&quot;Scans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
      &lt;span &gt;{&lt;/span&gt;
        &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
        &lt;span &gt;&quot;Relation OID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;16398&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
        &lt;span &gt;&quot;Restriction Clauses&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
	      …
        &lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
        &lt;span &gt;&quot;Plans&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;Plan&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;schema_indexes_table_id_name_idx&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              …
              &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.68&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;352.94&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(table_id = id)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(is_valid AND (NOT is_unique) AND (last_used_at &amp;lt; &apos;2021-10-17&apos;::date) AND (database_id = 1) AND (0 &amp;lt;&gt; ALL (columns)))&quot;&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;
          &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;Plan&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Index Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;index_schema_indexes_on_database_id&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              ...
              &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.56&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2581.00&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Index Cond&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(database_id = 1)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(is_valid AND (NOT is_unique) AND (last_used_at &amp;lt; &apos;2021-10-17&apos;::date) AND (0 &amp;lt;&gt; ALL (columns)))&quot;&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;
          &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;Plan&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Node ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Node Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;Seq Scan&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              ...
              &lt;span &gt;&quot;Startup Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.00&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Total Cost&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;3933763.60&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
              &lt;span &gt;&quot;Filter&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;((invalidated_at_snapshot_id IS NULL) AND is_valid AND (NOT is_unique) AND (last_used_at &amp;lt; &apos;2021-10-17&apos;::date) AND (database_id = 1) AND (0 &amp;lt;&gt; ALL (columns)))&quot;&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;
          &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, the &lt;code &gt;Seq Scan&lt;/code&gt; option was by far the most expensive and was therefore not chosen. You can also see the different index options and their costs.&lt;/p&gt;
&lt;p&gt;What is especially interesting with this plan is that there was actually a cheaper index scan available, but Postgres did not end up using it in the final plan. This is because the Nested Loop ended up being cheaper by using the &lt;code &gt;schema_indexes&lt;/code&gt; table as the outer table in the nested loop. The first index could only have been used if the Nested Loop relationship was inverted. That is, if &lt;code &gt;table_id&lt;/code&gt; values were used as the input to the &lt;code &gt;schema_indexes&lt;/code&gt; scan, instead of &lt;code &gt;table_id&lt;/code&gt; values being the output that&apos;s matched against the &lt;code &gt;schema_tables&lt;/code&gt; table&apos;s &lt;code &gt;id&lt;/code&gt; column.&lt;/p&gt;
&lt;p&gt;As you can see, this data can be especially useful when determining why a particular index wasn’t used, or to consider how to consolidate indexes. In the pganalyze Index Advisor this is surfaced visually in the advanced analysis view:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor advanced analysis&quot; title=&quot;pganalyze Index Advisor advanced analysis&quot; src=&quot;https://pganalyze.com/static/c201ca2f20ace7ff5e151ea54eaf52d4/1d69c/index_advisor_index_options.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Note we also indicate the individual filter clauses for the scan, and show which indexes are matching each clause.&lt;/p&gt;
&lt;h3 id=&quot;making-index-recommendations-based-on-restriction-clauses&quot; &gt;&lt;a href=&quot;#making-index-recommendations-based-on-restriction-clauses&quot; aria-label=&quot;making index recommendations based on restriction clauses permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Making index recommendations based on restriction clauses&lt;/h3&gt;
&lt;p&gt;In addition to comparing different existing indexes, we can use the data available to the planner to ask the question &lt;em&gt;“What would the best index look like?”&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;For a scan like the above example, we get a list of restriction clauses, which is a combination of the WHERE clauses as well as the JOIN condition. For index scans to work as expected, one or more of the clauses need to match the index definition.&lt;/p&gt;
&lt;p&gt;The data looks like this for each scan:&lt;/p&gt;
&lt;div  data-language=&quot;json&quot;&gt;&lt;pre &gt;&lt;code &gt;        &lt;span &gt;&quot;Restriction Clauses&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Expression&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;schema_indexes.is_valid&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Selectivity&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.9926&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Relation Column&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;is_valid&quot;&lt;/span&gt;
          &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;
            &lt;span &gt;&quot;ID&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;2&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Expression&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;(schema_indexes.database_id = 1)&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Selectivity&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0.0001&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;OpExpr&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
              &lt;span &gt;&quot;Operator&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
                &lt;span &gt;&quot;Oid&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;416&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;&quot;Name&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;=&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;&quot;Left Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;bigint&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;&quot;Right Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;integer&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;&quot;Result Type&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;boolean&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
                &lt;span &gt;&quot;Source Func&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;int84eq&quot;&lt;/span&gt;
              &lt;span &gt;}&lt;/span&gt;
            &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
            &lt;span &gt;&quot;Relation Column&quot;&lt;/span&gt;&lt;span &gt;:&lt;/span&gt; &lt;span &gt;&quot;database_id&quot;&lt;/span&gt;
          &lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
         ...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Using this data we then attempt a best guess at making a new index, run a &lt;code &gt;CREATE INDEX&lt;/code&gt; command behind the scenes, and re-run the Postgres planner so that it can consider the new index. If the cost of the new scan improves on the initial scan, we make a recommendation and note the difference in estimated cost.&lt;/p&gt;
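&lt;p&gt;To try the same before-and-after comparison by hand, a rough sketch on a disposable test database that contains the &lt;code &gt;schema_indexes&lt;/code&gt; table could look like the following. This is not how &lt;code &gt;pg_plan&lt;/code&gt; works internally: the query, the index name and the index columns are hypothetical (loosely based on the filter conditions in the plans above), and the costs will only resemble production if the test database has comparable statistics. Note that &lt;code &gt;CREATE INDEX&lt;/code&gt; blocks writes to the table until the transaction ends, so this should not be run against production:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;BEGIN;

-- Planner cost estimate using only the existing indexes
EXPLAIN (FORMAT JSON)
SELECT table_id
  FROM schema_indexes
 WHERE database_id = 1 AND is_valid AND NOT is_unique
   AND last_used_at &amp;lt; &apos;2021-10-17&apos;::date;

-- Hypothetical candidate index (name and column choice are assumptions)
CREATE INDEX candidate_idx ON schema_indexes (database_id, last_used_at);

-- Same query again: compare &quot;Total Cost&quot; against the first plan
EXPLAIN (FORMAT JSON)
SELECT table_id
  FROM schema_indexes
 WHERE database_id = 1 AND is_valid AND NOT is_unique
   AND last_used_at &amp;lt; &apos;2021-10-17&apos;::date;

-- Throw the candidate index away again
ROLLBACK;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;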
&lt;p&gt;In summary, you can imagine the Index Advisor working roughly like this:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor architecture&quot; title=&quot;pganalyze Index Advisor architecture&quot; src=&quot;https://pganalyze.com/static/9ca071020e082c267e1de228e0ba4727/1d69c/index_advisor_architecture.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;h2 id=&quot;understanding-postgres-clause-selectivity&quot; &gt;&lt;a href=&quot;#understanding-postgres-clause-selectivity&quot; aria-label=&quot;understanding postgres clause selectivity permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Understanding Postgres clause selectivity&lt;/h2&gt;
&lt;p&gt;If you look closely at the earlier advanced analysis screenshot, you will notice a new field that we’ve just made available in a new Index Advisor update: &lt;strong&gt;Selectivity&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What is Selectivity?&lt;/strong&gt; It indicates what fraction of rows of the table will be matched by the particular clause of the query. This information is then used by Postgres to estimate the row count that a node returns, as well as determine the cost of that plan node.&lt;/p&gt;
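&lt;p&gt;As a small worked example of how selectivity turns into a row estimate: for clauses combined with AND, the planner by default treats them as independent and multiplies their selectivities (unless extended statistics say otherwise), and then multiplies the result by the estimated number of rows in the table. The table size of 1,000,000 rows below is a made-up number for illustration; the two selectivities are the ones shown for &lt;code &gt;is_valid&lt;/code&gt; and &lt;code &gt;database_id = 1&lt;/code&gt; in the restriction clauses above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- rows = table rows * selectivity(is_valid) * selectivity(database_id = 1)
-- 1000000 is a hypothetical table size, not a value from the example
SELECT round(1000000 * 0.9926 * 0.0001) AS estimated_rows;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With these inputs the estimate comes out at about 99 rows, almost entirely driven by the highly selective &lt;code &gt;database_id = 1&lt;/code&gt; clause.&lt;/p&gt;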
&lt;p&gt;Selectivity estimations are front and center to how the planner operates, but they are unfortunately hidden behind the scenes. Historically, one would have had to resort to counting or filtering the actual data to confirm how frequent certain values are, or run manual queries against the Postgres catalog.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How does the planner know the selectivity?&lt;/strong&gt; Counting actual table rows would be far too expensive during time-sensitive planning. Instead, the planner primarily relies on the &lt;code &gt;pg_statistic&lt;/code&gt; table (often accessed through the &lt;code &gt;pg_stats&lt;/code&gt; view for debugging), which holds the table statistics collected by the &lt;code &gt;ANALYZE&lt;/code&gt; command in Postgres. You can learn more about how the Postgres planner uses statistics in the &lt;a href=&quot;https://www.postgresql.org/docs/current/planner-stats.html&quot;&gt;Postgres documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The data in the &lt;code &gt;pg_stats&lt;/code&gt; view can be queried like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stats &lt;span &gt;WHERE&lt;/span&gt; tablename &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;z&apos;&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; attname &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;a&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;-[ RECORD 1 ]----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | public
tablename              | z
attname                | a
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | 17
most_common_vals       | {2,3,7,12,13,4,1,5,11,14,9,6,10,8,15,16,0}
most_common_freqs      | {0.0653,0.06446667,0.063766666,0.06363333,0.063533336,0.063433334,0.0629,0.061966665,0.061833333,0.0618,0.0611,0.0605,0.0604,0.060366668,0.059666667,0.0332,0.032133333}
histogram_bounds       | 
correlation            | 0.061594862
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this example you can see that there are a total of 17 distinct values (&lt;code &gt;n_distinct&lt;/code&gt;), with the values 1 through 15 each having roughly the same frequency (about 6%), and 0 and 16 being about half as frequent (&lt;code &gt;most_common_vals&lt;/code&gt;/&lt;code &gt;most_common_freqs&lt;/code&gt;). None of the rows have NULL values (&lt;code &gt;null_frac&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Now, to have accurate plans in the Index Advisor, this same information can be provided using the new special SET commands in the schema definition:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;avg_width&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;4&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;correlation&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;0.061594862&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;most_common_freqs&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;{0.0653,0.06446667,0.063766666,0.06363333,0.063533336,0.063433334,0.0629,0.061966665,0.061833333,0.0618,0.0611,0.0605,0.0604,0.060366668,0.059666667,0.0332,0.032133333}&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;most_common_vals&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;{2,3,7,12,13,4,1,5,11,14,9,6,10,8,15,16,0}&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;n_distinct&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;17&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SET&lt;/span&gt; pganalyze&lt;span &gt;.&lt;/span&gt;null_frac&lt;span &gt;.&lt;/span&gt;&lt;span &gt;public&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;z&lt;span &gt;.&lt;/span&gt;a &lt;span &gt;=&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can learn how to retrieve this information, as well as all the available settings, in the &lt;a href=&quot;https://pganalyze.com/docs/index-advisor/standalone/settings&quot;&gt;Index Advisor documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Based on this data we can now calculate the selectivity of a clause like &lt;code &gt;z.a = 12&lt;/code&gt; to determine that it is &lt;code &gt;0.0636&lt;/code&gt;. Or put differently, the planner estimates that 6.36% of the table would match this condition. This same information is now directly visible in the Index Advisor, when viewing the advanced analysis.&lt;/p&gt;
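&lt;p&gt;To make that lookup concrete, here is a small self-contained sketch that pairs up the &lt;code &gt;most_common_vals&lt;/code&gt; and &lt;code &gt;most_common_freqs&lt;/code&gt; arrays from the example above and picks the entry for the value 12. For an equality clause on a value that appears in the most-common-values list, the planner&apos;s estimate is essentially that value&apos;s recorded frequency. The alias names &lt;code &gt;mcv&lt;/code&gt;, &lt;code &gt;val&lt;/code&gt; and &lt;code &gt;freq&lt;/code&gt; are only illustrative:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Pair each most common value with its frequency, then look up the value 12
SELECT val, freq
  FROM unnest(
         &apos;{2,3,7,12,13,4,1,5,11,14,9,6,10,8,15,16,0}&apos;::int[],
         &apos;{0.0653,0.06446667,0.063766666,0.06363333,0.063533336,0.063433334,0.0629,0.061966665,0.061833333,0.0618,0.0611,0.0605,0.0604,0.060366668,0.059666667,0.0332,0.032133333}&apos;::real[]
       ) AS mcv(val, freq)
 WHERE val = 12;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The single row returned carries a frequency of roughly 0.0636, which is the selectivity quoted above.&lt;/p&gt;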
&lt;h3 id=&quot;how-we-incorporated-postgres-community-feedback&quot; &gt;&lt;a href=&quot;#how-we-incorporated-postgres-community-feedback&quot; aria-label=&quot;how we incorporated postgres community feedback permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How we incorporated Postgres community feedback&lt;/h3&gt;
&lt;p&gt;At this point we’d also like to give a shout-out to Hubert Lubaczewski (aka “depesz”), who &lt;a href=&quot;https://www.depesz.com/2021/10/22/why-is-it-hard-to-automatically-suggest-what-index-to-create/&quot;&gt;reviewed the initial version of the index advisor&lt;/a&gt;, had some critical feedback, and provided an example we could investigate further.&lt;/p&gt;
&lt;p&gt;Based on improvements we&apos;ve made, we now take selectivity estimates into account for index suggestions. In particular, we give priority to columns whose clauses have a low selectivity value, i.e. those that match only a small fraction of rows. Note this requires use of &lt;code &gt;SET&lt;/code&gt; commands in addition to the raw schema data for the best results.&lt;/p&gt;
&lt;p&gt;With these recent changes the pganalyze Index Advisor recommendation matches depesz&apos;s handcrafted index suggestion in the blog post:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Example result based on clause selectivity&quot; title=&quot;Example result based on clause selectivity&quot; src=&quot;https://pganalyze.com/static/199517e98ae9ffc34d5544934e7b0b13/1d69c/example_selectivity.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;This is a good example of how our planner-based index advisor approach can be improved and tuned, as its behavior is modeled on Postgres itself.&lt;/p&gt;
&lt;h2 id=&quot;creating-the-best-index-vs-creating-good-enough-indexes&quot; &gt;&lt;a href=&quot;#creating-the-best-index-vs-creating-good-enough-indexes&quot; aria-label=&quot;creating the best index vs creating good enough indexes permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Creating the best index, vs creating “good enough” indexes&lt;/h2&gt;
&lt;p&gt;Another question that came up in multiple conversations is &lt;em&gt;“Should I just create all indexes that the Index Advisor recommends for each query?”&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Unless you have just a handful of queries, the answer to that is no - you shouldn’t just create every index, because that would slow down writes to the table, as they have to update each index separately.&lt;/p&gt;
&lt;p&gt;Today, the best way to utilize the Index Advisor for a whole database is to try out different CREATE INDEX statements, and to make sure you update the schema definition with your index definitions, so that the Index Advisor can make its determination based on the existing indexes.&lt;/p&gt;
&lt;p&gt;But we are taking this a step further. The work we are currently doing in this area is focused on two aspects:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Utilize query workload data from &lt;code &gt;pg_stat_statements&lt;/code&gt; to weight common queries more heavily in index recommendations, and come up with “good enough” indexes that cover more queries&lt;/li&gt;
&lt;li&gt;Estimate the write overhead of a new index, based on the number of updates/deletes/inserts on a table, as well as the estimated index size&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With this, we are not only targeting better summary recommendations, but also want to help you determine when you can consolidate two existing, very similar indexes.&lt;/p&gt;
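&lt;p&gt;If you want to look at the raw inputs for these two aspects yourself, a minimal sketch using the standard statistics views could look like this. It only shows where such numbers can come from and is not the weighting logic the Index Advisor uses; the second query assumes the &lt;code &gt;pg_stat_statements&lt;/code&gt; extension is installed:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Write activity per table, as raw input for estimating index write overhead
-- (n_tup_hot_upd counts updates that did not need new index entries)
SELECT relname, n_tup_ins, n_tup_upd, n_tup_hot_upd, n_tup_del
  FROM pg_stat_user_tables
 ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC
 LIMIT 10;

-- Most frequently executed statements, as raw input for weighting queries
SELECT calls, query
  FROM pg_stat_statements
 ORDER BY calls DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;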
&lt;p&gt;Curious to learn more? Sign up to join us for a design research session:&lt;/p&gt;
&lt;h3 id=&quot;join-us-for-design-research-sessions&quot; &gt;&lt;a href=&quot;#join-us-for-design-research-sessions&quot; aria-label=&quot;join us for design research sessions permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Join us for design research sessions&lt;/h3&gt;
&lt;p&gt;If you are up for testing early prototypes and answering questions to help us understand your workflows better, then we would like to invite you to our design research sessions:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://www.userinterviews.com/projects/C9tzb5mnxA/apply&quot;&gt;pganalyze Design Research Sign-Up&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Realistically, there will always be a trial-and-error aspect in making indexing decisions. But good tools can help guide you in those decisions, no matter your level of Postgres know-how. Our goal with the &lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;pganalyze Index Advisor&lt;/a&gt; is to make indexing an activity that can be done by the whole team, and where it’s easy to get a &lt;code &gt;CREATE INDEX&lt;/code&gt; statement to start working from.&lt;/p&gt;
&lt;p&gt;As you see, the Index Advisor is based on the core logic of Postgres itself, and that forms the basis for making complex assessments behind the scenes. We believe in an iterative process and sharing what we’ve learned, and hope to continue the conversation on how to make indexing better for Postgres.&lt;/p&gt;
&lt;p&gt;We&apos;ve recently made a number of updates to the Index Advisor. Give it a try, and use the &lt;a href=&quot;https://pganalyze.com/docs/index-advisor/standalone/settings&quot;&gt;new SET table statistics syntax&lt;/a&gt; for best results. Encounter an issue with the Index Advisor? You can provide feedback through our &lt;a href=&quot;https://github.com/pganalyze/index-advisor-feedback/discussions&quot;&gt;dedicated discussion board on GitHub&lt;/a&gt;, or send us a support request for in-app functionality.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor promotion banner&quot; title=&quot;pganalyze Index Advisor promotion banner&quot; src=&quot;https://pganalyze.com/static/7dad04148f9e0117c49a306ff9ab40b1/acb04/promo_index_advisor.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;If you want to share this article with your peers, feel free to &lt;a href=&quot;https://twitter.com/intent/tweet?text=%E2%80%9DHow%20we%20deconstructed%20the%20Postgres%20planner%20to%20find%20indexing%20opportunities%22%20-%20In%20this%20article,%20%40pganalyze%20shares%20how%20they%20built%20their%20new%20index%20advisor%20for%20%23Postgres%20and%20how%20they%20run%20the%20Postgres%20planner%20as%20a%20library%3A%20https://pganalyze.com/blog/deconstructing-the-postgres-planner&quot;&gt;tweet it&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[A better way to index your Postgres database: pganalyze Index Advisor]]></title><description><![CDATA[When you run an application with a relational database attached, you will no doubt have encountered this question: Which indexes should I create? For some of us, indexing comes naturally, and B-tree, GIN and GIST are words of everyday use. And for some of us it’s more challenging to find out which index to create, taking a lot of time to get right. But what unites us is that creating and tweaking indexes is part of our job when we use a relational database such as Postgres in production. We need…]]></description><link>https://pganalyze.com/blog/introducing-pganalyze-index-advisor</link><guid isPermaLink="false">https://pganalyze.com/blog/introducing-pganalyze-index-advisor</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 23 Sep 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Screenshot of the new pganalyze Index Advisor&quot; title=&quot;Screenshot of the new pganalyze Index Advisor&quot; src=&quot;https://pganalyze.com/static/8b658545c93873c38a1d2fb2d9be699d/1d69c/header-image.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;When you run an application with a relational database attached, you will no doubt have encountered this question: Which indexes should I create?&lt;/p&gt;
&lt;p&gt;For some of us, indexing comes naturally, and B-tree, GIN and GiST are words of everyday use. And for some of us it’s more challenging to &lt;a href=&quot;https://pganalyze.com/blog/postgres-create-index&quot;&gt;find out which index to create&lt;/a&gt;, taking a lot of time to get right. But what unites us is that creating and tweaking indexes is part of our job when we use a relational database such as Postgres in production. We need to get indexes right, in order to make sure our application performs well.&lt;/p&gt;
&lt;p&gt;There are multiple ways to determine which indexes get used in your Postgres database. For example, you may choose to query the &lt;code &gt;pg_stat_user_indexes&lt;/code&gt; view. There are Postgres extensions like HypoPG to try out hypothetical indexes on your database server. And some of us may decide to go ahead and simply &lt;a href=&quot;https://twitter.com/craigkerstiens/status/851817428833009664&quot;&gt;index every column on every table&lt;/a&gt;.&lt;/p&gt;
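&lt;p&gt;As a minimal sketch of the first option, a query like the following lists indexes that have not been scanned since statistics were last reset. Treat the result as a starting point rather than a verdict, since an index with zero scans may still back a constraint or be used on a replica:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Indexes with no recorded scans since the last statistics reset
SELECT schemaname, relname, indexrelname, idx_scan
  FROM pg_stat_user_indexes
 WHERE idx_scan = 0
 ORDER BY relname, indexrelname;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;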
&lt;p&gt;But the reality nowadays is that modern apps are complex, and applications built on Postgres grow at an incredible pace. This makes indexing more important, but also more challenging than ever. As developers we want to focus on what matters, and not spend hours investigating which Postgres index to create.&lt;/p&gt;
&lt;p&gt;At the beginning of this year we set out to improve the status quo for indexing with Postgres. And today, after many months of effort and having published an &lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;eBook about indexing in Postgres&lt;/a&gt;, we’re excited to announce the new &lt;a href=&quot;https://pganalyze.com/postgres-index-advisor&quot;&gt;pganalyze Index Advisor for Postgres&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Before we dive into all the details, let’s take a step back and ask ourselves “How could we determine which index to create?”&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#postgres-indexing-is-machine-learning-the-answer&quot;&gt;Postgres Indexing: Is machine learning the answer?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#how-postgres-determines-when-to-use-an-index&quot;&gt;How Postgres determines when to use an index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#creating-the-best-postgres-index-for-your-query&quot;&gt;Creating the best Postgres index for your query&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#review-existing-indexes-with-the-index-advisor&quot;&gt;Review existing indexes with the Index Advisor&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#try-out-the-index-advisor-for-free-with-the-standalone-tool&quot;&gt;Try out the Index Advisor for free with the standalone tool&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#automatic-index-advisor-for-your-production-queries-in-pganalyze&quot;&gt;Automatic index advisor for your production queries in pganalyze&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#pganalyze-index-advisor-and-new-pricing-plans&quot;&gt;pganalyze Index Advisor and new pricing plans&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;postgres-indexing-is-machine-learning-the-answer&quot; &gt;&lt;a href=&quot;#postgres-indexing-is-machine-learning-the-answer&quot; aria-label=&quot;postgres indexing is machine learning the answer permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Postgres Indexing: Is machine learning the answer?&lt;/h2&gt;
&lt;p&gt;It’s 2021, and of course we had to ask ourselves - is this a problem that requires ML and AI? Couldn’t we just train a model to create the right indexes for us?&lt;/p&gt;
&lt;p&gt;We turned to GitHub Copilot, the most sophisticated AI-based helper that exists today for developers, and asked it to create an index for a real world query in our own Postgres database:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Screenshot of GitHub CoPilot trying to recommend an index&quot; title=&quot;Screenshot of GitHub CoPilot trying to recommend an index&quot; src=&quot;https://pganalyze.com/static/dd9c227bc19c128d86e050fe701192c8/1d69c/copilot.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Suffice it to say that indexing like this is &lt;strong&gt;not effective&lt;/strong&gt;. You will end up with significant overhead due to indexing almost everything, including columns that are not even referenced in the query.&lt;/p&gt;
&lt;p&gt;Whilst this ML model will certainly improve, and there is research on more purpose-built solutions for databases, the point is: &lt;strong&gt;ML is not the magic solution we are looking for.&lt;/strong&gt; We need more than just machine learning to know which indexes to create.&lt;/p&gt;
&lt;p&gt;In fact, from our own experience, knowing which index to create does not require an ML model at all. Knowing how to create the best index can be done with a &lt;strong&gt;deterministic approach&lt;/strong&gt;, that takes into account production database queries and schema statistics, and has a detailed understanding of how Postgres works.&lt;/p&gt;
&lt;p&gt;And who knows best how Postgres works? Postgres itself!&lt;/p&gt;
&lt;h2 id=&quot;how-postgres-determines-when-to-use-an-index&quot; &gt;&lt;a href=&quot;#how-postgres-determines-when-to-use-an-index&quot; aria-label=&quot;how postgres determines when to use an index permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How Postgres determines when to use an index&lt;/h2&gt;
&lt;p&gt;We started out by asking ourselves the question: How does Postgres decide which index to use? We can find this logic in the &lt;a href=&quot;https://www.postgresql.org/docs/current/planner-optimizer.html&quot;&gt;Postgres planner&lt;/a&gt;, which takes a parsed query and turns it into an execution plan.&lt;/p&gt;
&lt;p&gt;Specifically, we decided to look at the function create_index_paths(..), where you can see that Postgres loops over all indexes on a particular table, and decides which indexes can be used:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;void&lt;/span&gt;
&lt;span &gt;create_index_paths&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PlannerInfo &lt;span &gt;*&lt;/span&gt;root&lt;span &gt;,&lt;/span&gt; RelOptInfo &lt;span &gt;*&lt;/span&gt;rel&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
	&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;

	&lt;span &gt;/* Skip the whole mess if no indexes */&lt;/span&gt;
	&lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;rel&lt;span &gt;-&gt;&lt;/span&gt;indexlist &lt;span &gt;==&lt;/span&gt; NIL&lt;span &gt;)&lt;/span&gt;
		&lt;span &gt;return&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

	&lt;span &gt;/* Bitmap paths are collected and then dealt with at the end */&lt;/span&gt;
	bitindexpaths &lt;span &gt;=&lt;/span&gt; bitjoinpaths &lt;span &gt;=&lt;/span&gt; joinorclauses &lt;span &gt;=&lt;/span&gt; NIL&lt;span &gt;;&lt;/span&gt;

	&lt;span &gt;/* Examine each index in turn */&lt;/span&gt;
	&lt;span &gt;foreach&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;lc&lt;span &gt;,&lt;/span&gt; rel&lt;span &gt;-&gt;&lt;/span&gt;indexlist&lt;span &gt;)&lt;/span&gt;
	&lt;span &gt;{&lt;/span&gt;
		IndexOptInfo &lt;span &gt;*&lt;/span&gt;index &lt;span &gt;=&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;IndexOptInfo &lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;lfirst&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;lc&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

		&lt;span &gt;/*
		 * Ignore partial indexes that do not match the query.
		 * (generate_bitmap_or_paths() might be able to do something with
		 * them, but that&apos;s of no concern here.)
		 */&lt;/span&gt;
		&lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;index&lt;span &gt;-&gt;&lt;/span&gt;indpred &lt;span &gt;!=&lt;/span&gt; NIL &lt;span &gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span &gt;!&lt;/span&gt;index&lt;span &gt;-&gt;&lt;/span&gt;predOK&lt;span &gt;)&lt;/span&gt;
			&lt;span &gt;continue&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
   &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Going into all this logic would likely fill multiple books, and it is based on decades of academic research. Clearly, Postgres is very sophisticated about determining which indexes can be used for a given query. Amongst the core decisions it makes are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Does the index match the columns used in the query?&lt;/li&gt;
&lt;li&gt;Does the query’s operator match the operator class of the index?&lt;/li&gt;
&lt;li&gt;Does the index have a sort order that can be used by the query to avoid an explicit Sort step?&lt;/li&gt;
&lt;li&gt;Does the query condition match a partial index condition?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are many other requirements and heuristics as well, and understanding them takes extensive knowledge of Postgres’ inner workings.&lt;/p&gt;
&lt;p&gt;At pganalyze, we looked at this, and other functions, and we asked ourselves: &lt;em&gt;&lt;strong&gt;What if we used the Postgres planner to tell us which index it would like to see, based on a given query?&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;That is, instead of asking “does this index match this query?”, we are asking “what’s the perfect index for this query?”. Perfect as in: ticks all the boxes in terms of operators/operator classes, columns and data types, and can be used to fulfill the filter and join clauses of the query, if possible.&lt;/p&gt;
&lt;p&gt;This logic based on the Postgres planner is the centerpiece of the new &lt;a href=&quot;https://pganalyze.com/postgres-index-advisor&quot;&gt;pganalyze Index Advisor&lt;/a&gt;. Our index advisor is available in the pganalyze app, but we also decided to provide a &lt;strong&gt;free, standalone version available to anyone&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Simply paste your query and schema data and get insights on whether existing indexes are useful, or learn why indexes you thought might help are ignored. Note that data uploaded to the standalone pganalyze Index Advisor stays local within your browser, unless you explicitly use the share functionality.&lt;/p&gt;
&lt;p&gt;Going forward in this article, when you see examples and screenshots of the index advisor for Postgres, we are showing the public, standalone tool.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Effective Indexing eBook promotion banner&quot; title=&quot;Effective Indexing eBook promotion banner&quot; src=&quot;https://pganalyze.com/static/b24fdd95dbc38757fe354c86d9ad9aaa/acb04/promo_ebook.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id=&quot;creating-the-best-postgres-index-for-your-query&quot; &gt;&lt;a href=&quot;#creating-the-best-postgres-index-for-your-query&quot; aria-label=&quot;creating the best postgres index for your query permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Creating the best Postgres index for your query&lt;/h2&gt;
&lt;p&gt;Let’s go back to our earlier example, and run it through the pganalyze Index Advisor:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor Example showing better recommendation than GitHub CoPilot&quot; title=&quot;pganalyze Index Advisor Example showing better recommendation than GitHub CoPilot&quot; src=&quot;https://pganalyze.com/static/4c827dde6b120965840518cb121d4b51/1d69c/issues_example.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;As you can see, we get a recommendation for a single multi-column index that covers all columns that are in the WHERE clause, except for the column that’s inside the OR condition. This is the best index that we can create to ensure the query runs fast.&lt;/p&gt;
&lt;p&gt;At launch the index advisor is focused on recommending B-tree indexes, with support for other index types coming soon.&lt;/p&gt;
&lt;p&gt;Note that the index advisor also understands common query patterns like filtering out records based on a &lt;code &gt;deleted_at&lt;/code&gt; column, and recommends partial indexes for these queries:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Example of a partial index in pganalyze Index Advisor&quot; title=&quot;Example of a partial index in pganalyze Index Advisor&quot; src=&quot;https://pganalyze.com/static/9178eaa81f801c6e25b500d46bca9027/1d69c/partial_index_example.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
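&lt;p&gt;To make the pattern concrete, here is a small sketch of a soft-delete query and a partial index that matches it. The &lt;code &gt;issues&lt;/code&gt; table and &lt;code &gt;project_id&lt;/code&gt; column are hypothetical names chosen for illustration, and this is not the literal output of the Index Advisor:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical query that only reads rows that are not soft-deleted
SELECT * FROM issues WHERE project_id = 123 AND deleted_at IS NULL;

-- Partial index: only rows where deleted_at IS NULL are stored in the index
CREATE INDEX ON issues (project_id) WHERE deleted_at IS NULL;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because deleted rows are left out, such an index stays smaller than a full index on the same column, and Postgres can only use it for queries whose conditions imply the index predicate.&lt;/p&gt;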
&lt;h2 id=&quot;review-existing-indexes-with-the-index-advisor&quot; &gt;&lt;a href=&quot;#review-existing-indexes-with-the-index-advisor&quot; aria-label=&quot;review existing indexes with the index advisor permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Review existing indexes with the Index Advisor&lt;/h2&gt;
&lt;p&gt;The pganalyze Index Advisor is also able to determine how different existing indexes perform, to help you understand which index Postgres will most likely use.&lt;/p&gt;
&lt;p&gt;For example, imagine a schema and index definition like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; events&lt;span &gt;(&lt;/span&gt;
  id bigserial &lt;span &gt;PRIMARY&lt;/span&gt; &lt;span &gt;KEY&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  created_at timestamptz&lt;span &gt;,&lt;/span&gt;
  severity &lt;span &gt;smallint&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  organization_id &lt;span &gt;bigint&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  description &lt;span &gt;text&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
  details jsonb
&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; events&lt;span &gt;(&lt;/span&gt;organization_id&lt;span &gt;,&lt;/span&gt; severity&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We want to understand how effective this index is for queries that only query the “severity” column, without looking up a particular organization.&lt;/p&gt;
&lt;p&gt;With the index advisor, we can see the cost difference between this multi-column index and a single-column index on “severity”, and that Postgres prefers using the single-column index in most situations:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Example of an existing index comparison in pganalyze Index Advisor&quot; title=&quot;Example of an existing index comparison in pganalyze Index Advisor&quot; src=&quot;https://pganalyze.com/static/116afeaf141d2fe48d8d6ca43b303d41/1d69c/existing_index.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;This can be explained by the fact that single-column indexes are usually smaller, and it’s more efficient, especially in older Postgres releases, to find index records when the queried column is listed first in the column list. You may still choose to use a multi-column index, but this helps you understand the trade-off.&lt;/p&gt;
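&lt;p&gt;If you would like to check this trade-off on your own server, one option is the HypoPG extension. The sketch below assumes HypoPG is installed and uses the &lt;code &gt;events&lt;/code&gt; table from above; the filter value is arbitrary. Which plan Postgres picks depends on your table statistics, so results will differ between databases:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Assumes the HypoPG extension is available on the server
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Register a hypothetical single-column index (nothing is built on disk)
SELECT * FROM hypopg_create_index(&apos;CREATE INDEX ON events (severity)&apos;);

-- Plain EXPLAIN (without ANALYZE) takes hypothetical indexes into account
EXPLAIN SELECT * FROM events WHERE severity = 2;

-- Remove all hypothetical indexes again
SELECT hypopg_reset();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Comparing the estimated cost in this plan with the plan produced before the hypothetical index was registered shows how much the planner expects to gain from the single-column index.&lt;/p&gt;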
&lt;p&gt;
&lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;pganalyze Index Advisor promotion banner&quot; title=&quot;pganalyze Index Advisor promotion banner&quot; src=&quot;https://pganalyze.com/static/7dad04148f9e0117c49a306ff9ab40b1/acb04/promo_index_advisor.jpg&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id=&quot;try-out-the-index-advisor-for-free-with-the-standalone-tool&quot; &gt;&lt;a href=&quot;#try-out-the-index-advisor-for-free-with-the-standalone-tool&quot; aria-label=&quot;try out the index advisor for free with the standalone tool permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Try out the Index Advisor for free with the standalone tool&lt;/h2&gt;
&lt;p&gt;Want to try out the index advisor yourself? As mentioned above, we developed a standalone version of the index advisor that runs fully in your web browser, powered by our self-contained Postgres planner compiled to WebAssembly.&lt;/p&gt;
&lt;p&gt;You can simply go to &lt;a href=&quot;https://pganalyze.com/index-advisor&quot;&gt;https://pganalyze.com/index-advisor&lt;/a&gt;, paste your query and schema, and get your recommendations. If you don’t have a query and schema ready, for example because you are reading this on your mobile phone, you can take a look at how it works with a set of examples we added for your convenience.&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Standalone pganalyze Index Advisor tool&quot; title=&quot;Standalone pganalyze Index Advisor tool&quot; src=&quot;https://pganalyze.com/static/6ff47e98c45f1c5dbfb29ee1753fc2c0/1d69c/standalone_start.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;We’ve also ensured that the standalone tool is ready for collaboration. If you want to share index recommendations with your team, simply click the [Share] button. After you confirm, this uploads the result of the index advisor to the pganalyze servers for sharing, and gives you a unique URL to share. Note that unless you share, all data stays local within your web browser.&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Share function of standalone pganalyze Index Advisor tool&quot; title=&quot;Share function of standalone pganalyze Index Advisor tool&quot; src=&quot;https://pganalyze.com/static/968b5ff0baa5f327151a350d2e2f3921/1d69c/standalone_share.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;Of course, copying query texts can be tedious and a lot of work. But, if you are a &lt;a href=&quot;https://pganalyze.com&quot;&gt;pganalyze&lt;/a&gt; customer, we already have your query information in our app. The second part of today’s launch is about the new &lt;strong&gt;in-app pganalyze Index Advisor&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id=&quot;automatic-index-advisor-for-your-production-queries-in-pganalyze&quot; &gt;&lt;a href=&quot;#automatic-index-advisor-for-your-production-queries-in-pganalyze&quot; aria-label=&quot;automatic index advisor for your production queries in pganalyze permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Automatic index advisor for your production queries in pganalyze&lt;/h2&gt;
&lt;p&gt;With the new Index Advisor in pganalyze, you can now see at a glance what index recommendations exist for each of your queries. You can simply go to the query details page for your queries, and see what the Index Advisor recommends:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;In-app screenshot of pganalyze Index Advisor&quot; title=&quot;In-app screenshot of pganalyze Index Advisor&quot; src=&quot;https://pganalyze.com/static/9716fe29cfa7a4c02ff8c358f54cdd98/1d69c/in_app_advisor.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;This is really nice, but we already have work underway to help you get an even better assessment of index usage summarized &lt;strong&gt;across your whole database&lt;/strong&gt;. But more on that soon (sign up for the newsletter if you want to get updates about this).&lt;/p&gt;
&lt;h2 id=&quot;pganalyze-index-advisor-and-new-pricing-plans&quot; &gt;&lt;a href=&quot;#pganalyze-index-advisor-and-new-pricing-plans&quot; aria-label=&quot;pganalyze index advisor and new pricing plans permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;pganalyze Index Advisor and new pricing plans&lt;/h2&gt;
&lt;p&gt;The pganalyze Index Advisor represents a significant improvement to the core functionality of pganalyze, and introduces additional sophisticated processing for each query received by pganalyze. We are therefore taking this moment to introduce both a &lt;a href=&quot;https://pganalyze.com/pricing&quot;&gt;new Production and a new Scale plan&lt;/a&gt;. In addition to the &lt;strong&gt;Index Advisor&lt;/strong&gt;, the new Scale plan also features &lt;strong&gt;SAML-based Single Sign On&lt;/strong&gt; in early access, to integrate with identity providers such as Okta.&lt;/p&gt;
&lt;p&gt;If you are an existing pganalyze customer on (what is now) a legacy plan, you can try out the Index Advisor until the end of October 2021. Trying out the Index Advisor requires no changes to your existing pganalyze integration.&lt;/p&gt;
&lt;p&gt;If you do not have an account with us at the moment but sign up for a new trial, the pganalyze Index Advisor will be activated for your 14-day trial. Try it out today in the pganalyze app, or &lt;a href=&quot;https://app.pganalyze.com/users/sign_up&quot;&gt;start a new trial&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;All of us at pganalyze are excited to share the new pganalyze Index Advisor with you. Try out the standalone tool or explore the new in-app functionality today. We hope the standalone tool is a service you will find valuable and come back to time and again. Feel free to bookmark it!&lt;/p&gt;
&lt;p&gt;You can provide feedback through our &lt;a href=&quot;https://github.com/pganalyze/index-advisor-feedback/discussions&quot;&gt;dedicated discussion board on GitHub&lt;/a&gt;, or send us a support request for in-app functionality. We look forward to hearing from you.&lt;/p&gt;
&lt;p&gt;If you want to share this article with your peers, feel free to &lt;a href=&quot;https://twitter.com/intent/tweet?text=%E2%80%9DA%20better%20way%20for%20indexing%20your%20Postgres%20database%22%20-%20In%20this%20article,%20%40pganalyze%20share%20how%20they%20approached%20building%20their%20new%20index%20advisor%20for%20%23Postgres%20and%20give%20you%20free%20access%20to%20it%3A%20https://pganalyze.com/blog/introducing-pganalyze-index-advisor&quot;&gt;tweet it&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Using Postgres CREATE INDEX: Understanding operator classes, index types & more]]></title><description><![CDATA[Most developers working with databases know the challenge: New code gets deployed to production, and suddenly the application is slow. We investigate, look at our APM tools and our database monitoring, and we find out that the new code caused a new query to be issued. We investigate further, and discover the query is not able to use an index. But what makes an index usable by a query, and how can we add the right index in Postgres? In this post we’ll look at the practical aspects of using the…]]></description><link>https://pganalyze.com/blog/postgres-create-index</link><guid isPermaLink="false">https://pganalyze.com/blog/postgres-create-index</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 12 Aug 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Most developers working with databases know the challenge: New code gets deployed to production, and suddenly the application is slow. We investigate, look at our APM tools and our database monitoring, and we find out that the new code caused a new query to be issued. We investigate further, and discover the query is not able to use an index.&lt;/p&gt;
&lt;p&gt;But what makes an index usable by a query, and how can we add the right index in Postgres?&lt;/p&gt;
&lt;p&gt;In this post we’ll look at the practical aspects of using the &lt;code &gt;CREATE INDEX&lt;/code&gt; command, as well as how you can &lt;strong&gt;analyze a PostgreSQL query for its operators and data types&lt;/strong&gt;, so you can choose the best index definition.&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;a  href=&quot;https://pganalyze.com/static/b276e42dae661e98b7fbb885bd7609ac/aa440/postgres-create-index.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Representation of a Postgres query compared to the matching index definition&quot; title=&quot;Representation of a Postgres query compared to the matching index definition&quot; src=&quot;https://pganalyze.com/static/b276e42dae661e98b7fbb885bd7609ac/1d69c/postgres-create-index.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#how-do-you-create-an-index-in-postgres&quot;&gt;How do you create an index in Postgres?&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#parse-analysis-how-postgres-interprets-your-query&quot;&gt;Parse analysis: How Postgres interprets your query&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#looking-behind-the-scenes-operators-and-data-types&quot;&gt;Looking behind the scenes: Operators and data types&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#finding-the-right-index-type&quot;&gt;Finding the right index type&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#specifying-operator-classes-during-create-index&quot;&gt;Specifying operator classes during CREATE INDEX&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#specifying-multiple-columns-when-adding-a-postgres-index&quot;&gt;Specifying multiple columns when adding a Postgres index&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#using-functions-and-expressions-in-an-index-definition&quot;&gt;Using functions and expressions in an index definition&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#specifying-a-where-clause-to-create-partial-postgresql-indexes&quot;&gt;Specifying a WHERE clause to create partial PostgreSQL indexes&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#using-include-to-create-a-covering-index-for-index-only-scans&quot;&gt;Using INCLUDE to create a covering index for Index-Only Scans&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#adding-and-dropping-postgresql-indexes-safely-on-production&quot;&gt;Adding and dropping PostgreSQL indexes safely on production&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;how-do-you-create-an-index-in-postgres&quot; &gt;&lt;a href=&quot;#how-do-you-create-an-index-in-postgres&quot; aria-label=&quot;how do you create an index in postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How do you create an index in Postgres?&lt;/h2&gt;
&lt;p&gt;Before we dive into the internals, let’s set the stage and look at the most basic way of creating an index in Postgres. The essence of adding an index is this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;table&lt;/span&gt;&lt;span &gt;]&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;column1&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For an actual example, let’s say we have a query on our users table that looks for a particular email address:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; users &lt;span &gt;WHERE&lt;/span&gt; users&lt;span &gt;.&lt;/span&gt;email &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;test@example.com&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can see this query is searching for values in the “email” column - so the index we should create is on that particular column:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When we run this command, Postgres will create an index for us.&lt;/p&gt;
&lt;p&gt;It&apos;s important to &lt;strong&gt;remember that indexes are redundant data structures&lt;/strong&gt;. If you drop an index you don&apos;t lose any data. The primary benefit of an index is to allow faster searching of particular rows in a table. The alternative to having an index is to have Postgres scan each row individually (&quot;Sequential Scan&quot;), which is of course very slow for large tables.&lt;/p&gt;
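&lt;p&gt;To check what the command created and whether the query picks it up, something like the following sketch can help. It assumes the &lt;code &gt;users&lt;/code&gt; table from the example above. Keep in mind that on very small tables the planner may still prefer a Sequential Scan even though a usable index exists.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- List the indexes that currently exist on the users table
SELECT indexname, indexdef
  FROM pg_indexes
 WHERE tablename = &apos;users&apos;;

-- Show the plan Postgres would choose for the query, without running it
EXPLAIN SELECT * FROM users WHERE users.email = &apos;test@example.com&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;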
&lt;p&gt;Let&apos;s take a look behind the scenes of how Postgres determines whether to use an index.&lt;/p&gt;
&lt;h3 id=&quot;parse-analysis-how-postgres-interprets-your-query&quot; &gt;&lt;a href=&quot;#parse-analysis-how-postgres-interprets-your-query&quot; aria-label=&quot;parse analysis how postgres interprets your query permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Parse analysis: How Postgres interprets your query&lt;/h3&gt;
&lt;p&gt;When Postgres runs our query, it steps through multiple stages. At a high level, they are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Parsing (see our &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser#how-pg_query-turns-a-postgres-statement-into-a-parse-tree&quot;&gt;blog post on the Postgres parser&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Parse analysis&lt;/li&gt;
&lt;li&gt;Planning&lt;/li&gt;
&lt;li&gt;Execution&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Throughout these stages the query is no longer just text - it&apos;s represented as a tree. Each stage modifies and annotates the tree structure, until it&apos;s finally executed. For understanding Postgres index usage, we need to first understand what &lt;strong&gt;parse analysis&lt;/strong&gt; does.&lt;/p&gt;
&lt;p&gt;Let&apos;s pick a slightly more complex example:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; users &lt;span &gt;WHERE&lt;/span&gt; users&lt;span &gt;.&lt;/span&gt;email &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;test@example.com&apos;&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; users&lt;span &gt;.&lt;/span&gt;deleted_at &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can look at the result of parse analysis by turning on the &lt;code &gt;debug_print_parse&lt;/code&gt; setting, and then looking at the Postgres logs (not recommended on production databases):&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;LOG:  parse tree:
DETAIL:     {QUERY 
	   ...
	      :quals 
	         {BOOLEXPR 
	         :boolop and 
	         :args (
	            {OPEXPR 
	            :opno 98 
	            :opfuncid 67 
	            :opresulttype 16 
	            :opretset false 
	            :opcollid 0 
	            :inputcollid 100 
	            :args (
	               ...
	            )
	            :location 38
	            }
	            {NULLTEST 
	            :arg 
	               ...
	            :nulltesttype 0 
	            :argisrow false 
	            :location 80
	            }
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
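&lt;p&gt;As a rough sketch, output like the above can be produced from a single session on a development database. The settings below are standard Postgres parameters. Raising &lt;code &gt;client_min_messages&lt;/code&gt; to &lt;code &gt;log&lt;/code&gt; is an optional convenience that sends the output to the client, so you don&apos;t have to open the server log:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Send LOG-level messages to this session, not only to the server log
SET client_min_messages = log;
-- Print the tree produced by parse analysis for each statement
SET debug_print_parse = on;
-- Indent the tree output (this is already the default)
SET debug_pretty_print = on;

SELECT * FROM users WHERE users.email = &apos;test@example.com&apos; AND users.deleted_at IS NULL;

-- Turn the debugging output off again
RESET debug_print_parse;
RESET client_min_messages;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;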
&lt;p&gt;This format is a bit hard to read - let’s look at it in a more visual way, and with names instead of OIDs:&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;a  href=&quot;https://pganalyze.com/static/ded2511f7ca6069b5cb8495723df4156/aa440/postgres-parse-analysis.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Parse analysis visualization of the query&quot; title=&quot;Parse analysis visualization of the query&quot; src=&quot;https://pganalyze.com/static/ded2511f7ca6069b5cb8495723df4156/1d69c/postgres-parse-analysis.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;We can see two important parse nodes here, one for each expression in the &lt;code &gt;WHERE&lt;/code&gt; clause. The &lt;code &gt;OpExpr&lt;/code&gt; node, and the &lt;code &gt;NullTest&lt;/code&gt; node. For now, let&apos;s focus on the &lt;code &gt;OpExpr&lt;/code&gt; node.&lt;/p&gt;
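&lt;p&gt;If you want to resolve the OIDs from the raw parse tree yourself, one possible approach is to cast them to the matching &lt;code &gt;reg*&lt;/code&gt; types. This is only a sketch: the numbers are the ones printed in the &lt;code &gt;OPEXPR&lt;/code&gt; node of the log output above, and the column aliases are chosen purely for illustration.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Translate the OIDs from the OPEXPR node into readable names
SELECT 98::oid::regoperator  AS opno,         -- the operator
       67::oid::regprocedure AS opfuncid,     -- the function implementing it
       16::oid::regtype      AS opresulttype, -- the result data type
       (SELECT collname FROM pg_collation WHERE oid = 100) AS inputcollid;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;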
&lt;h3 id=&quot;looking-behind-the-scenes-operators-and-data-types&quot; &gt;&lt;a href=&quot;#looking-behind-the-scenes-operators-and-data-types&quot; aria-label=&quot;looking behind the scenes operators and data types permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Looking behind the scenes: Operators and data types&lt;/h3&gt;
&lt;p&gt;It&apos;s important to remember that Postgres is an &lt;strong&gt;object-relational&lt;/strong&gt; database system. That is, it&apos;s designed from the ground up to be extensible. Many of the references that are added in parse analysis are not hard-coded logic, but instead reference actual database objects in the Postgres catalog tables.&lt;/p&gt;
&lt;p&gt;The two most important objects to know about are &lt;strong&gt;data types&lt;/strong&gt; and &lt;strong&gt;operators&lt;/strong&gt;. You are most likely familiar with data types in Postgres; for example, you have used them when specifying the schema for your table. Operators in Postgres define how particular comparisons between one or two values, for example in a WHERE clause, are implemented.&lt;/p&gt;
&lt;p&gt;The &lt;code &gt;OpExpr&lt;/code&gt; node represents an expression that uses an operator to compare one or two values of a given type. In this case you can see we are using the &lt;code &gt;=(text, text)&lt;/code&gt; operator. This operator utilizes the &lt;code &gt;=&lt;/code&gt; symbol as its name, and has a &lt;code &gt;text&lt;/code&gt; data type on the left and right of the operator.&lt;/p&gt;
&lt;p&gt;We can query the &lt;code &gt;pg_operator&lt;/code&gt; table to see details about it, including which function implements the operator:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; oid&lt;span &gt;,&lt;/span&gt; oid::regoperator&lt;span &gt;,&lt;/span&gt; oprcode&lt;span &gt;,&lt;/span&gt; oprnegate::regoperator
  &lt;span &gt;FROM&lt;/span&gt; pg_operator
 &lt;span &gt;WHERE&lt;/span&gt; oprname &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;=&apos;&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; oprleft &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;text&apos;&lt;/span&gt;::regtype &lt;span &gt;AND&lt;/span&gt; oprright &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;text&apos;&lt;/span&gt;::regtype&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; oid |     oid      | oprcode |   oprnegate   
-----+--------------+---------+---------------
  98 | =(text,text) | texteq  | &amp;lt;&gt;(text,text)
(1 row)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And if you really want to know what’s happening, you can look up the operator&apos;s underlying &lt;code &gt;texteq&lt;/code&gt; function in &lt;a href=&quot;https://github.com/postgres/postgres/blob/REL_13_STABLE/src/backend/utils/adt/varlena.c#L1745&quot;&gt;the Postgres source&lt;/a&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * Comparison functions for text strings.
 */&lt;/span&gt;
Datum
&lt;span &gt;texteq&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;PG_FUNCTION_ARGS&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
    &lt;span &gt;if&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;lc_collate_is_c&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;collid&lt;span &gt;)&lt;/span&gt; &lt;span &gt;||&lt;/span&gt;
		collid &lt;span &gt;==&lt;/span&gt; DEFAULT_COLLATION_OID &lt;span &gt;||&lt;/span&gt;
		&lt;span &gt;pg_newlocale_from_collation&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;collid&lt;span &gt;)&lt;/span&gt;&lt;span &gt;-&gt;&lt;/span&gt;deterministic&lt;span &gt;)&lt;/span&gt;
    &lt;span &gt;{&lt;/span&gt;
        &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
    	result &lt;span &gt;=&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;memcmp&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;VARDATA_ANY&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;targ1&lt;span &gt;)&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;VARDATA_ANY&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;targ2&lt;span &gt;)&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
				  len1 &lt;span &gt;-&lt;/span&gt; VARHDRSZ&lt;span &gt;)&lt;/span&gt; &lt;span &gt;==&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
        &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
    &lt;span &gt;}&lt;/span&gt;
    &lt;span &gt;else&lt;/span&gt;
	&lt;span &gt;{&lt;/span&gt;
        &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
        result &lt;span &gt;=&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;text_cmp&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;arg1&lt;span &gt;,&lt;/span&gt; arg2&lt;span &gt;,&lt;/span&gt; collid&lt;span &gt;)&lt;/span&gt; &lt;span &gt;==&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
        &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
    &lt;span &gt;}&lt;/span&gt;
    &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That function illustrates nicely how Postgres considers the collation to determine whether it can do a fast comparison that simply compares bytes, or whether it has to do a more expensive full text comparison. As we can see from the source, using a C locale for your collation can yield performance benefits.&lt;/p&gt;
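&lt;p&gt;To see which collation applies in your own case, a sketch like the following can be used. It assumes the &lt;code &gt;users.email&lt;/code&gt; column from the earlier example. A column with the &lt;code &gt;default&lt;/code&gt; collation follows the database-level setting shown by the first query.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Collation and character classification of the current database
SELECT datname, datcollate, datctype
  FROM pg_database
 WHERE datname = current_database();

-- Collation assigned to a specific column
SELECT a.attname, c.collname
  FROM pg_attribute a
  JOIN pg_collation c ON c.oid = a.attcollation
 WHERE a.attrelid = &apos;users&apos;::regclass
   AND a.attname = &apos;email&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;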
&lt;p&gt;Of course you can also define your own custom operators that work on your own custom data types. Postgres is extensible like that, and that’s actually pretty neat.&lt;/p&gt;
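&lt;p&gt;As a minimal sketch of what that can look like, here is a made-up case-insensitive equality operator for &lt;code &gt;text&lt;/code&gt;. The function name &lt;code &gt;text_ci_eq&lt;/code&gt; and the operator name &lt;code &gt;===&lt;/code&gt; are hypothetical and chosen only for illustration. Note that an operator defined this way is not indexable on its own: it also has to be part of an operator class for the index type in question.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical function that implements the comparison
CREATE FUNCTION text_ci_eq(a text, b text) RETURNS boolean
  LANGUAGE sql IMMUTABLE
  AS $$ SELECT lower(a) = lower(b) $$;

-- Hypothetical operator that calls the function above
CREATE OPERATOR === (
  LEFTARG = text,
  RIGHTARG = text,
  FUNCTION = text_ci_eq,
  COMMUTATOR = ===
);

-- The new operator can now be used like a built-in one
SELECT &apos;Test@Example.com&apos;::text === &apos;test@example.com&apos;::text;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;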
&lt;p&gt;&lt;strong&gt;Operators are essential for creating the right index.&lt;/strong&gt; The operator that is used by an expression is the most important detail, besides the column name, that indicates whether a particular index can be used.&lt;/p&gt;
&lt;p&gt;You can think of operators as the &quot;how&quot; we want to search the table for values. For example, we may use a simple &lt;code &gt;=&lt;/code&gt; operator to match values for equality against an input value. Or we may utilize a more complex operator, such as &lt;code &gt;@@&lt;/code&gt; to perform a text search on a tsvector column.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/postgres-indexing&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Download Free eBook: Effective Indexing in Postgres&quot;
        title=&quot;Download Free eBook: Effective Indexing in Postgres&quot;
        src=&quot;https://pganalyze.com/static/97b01777597bdcba8b1803935f1b7da0/acb04/ebook_promo_postgres_create_index.jpg&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;finding-the-right-index-type&quot; &gt;&lt;a href=&quot;#finding-the-right-index-type&quot; aria-label=&quot;finding the right index type permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Finding the right index type&lt;/h2&gt;
&lt;p&gt;When you think of an index type, it&apos;s important to remember that it&apos;s ultimately a specific data structure that supports a specific, limited set of search operators. For example, the most common index type in Postgres, the B-tree index, supports the &lt;code &gt;=&lt;/code&gt; operator as well as the range comparison operators (&lt;code &gt;&amp;lt;&lt;/code&gt;, &lt;code &gt;&amp;lt;=&lt;/code&gt;, &lt;code &gt;&gt;=&lt;/code&gt;, &lt;code &gt;&gt;&lt;/code&gt;), and the &lt;code &gt;~&lt;/code&gt; and &lt;code &gt;~*&lt;/code&gt; operators in some cases. It does not support any other operators.&lt;/p&gt;
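&lt;p&gt;The operators a B-tree index supports for a given data type can be listed from the catalog. The following sketch does this for &lt;code &gt;text&lt;/code&gt; and the default &lt;code &gt;text_ops&lt;/code&gt; operator family. The exact rows returned may differ between Postgres versions.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Operators in the B-tree text_ops family that compare text with text.
-- B-tree strategy numbers: 1 is &amp;lt;, 2 is &amp;lt;=, 3 is =, 4 is &gt;=, 5 is &gt;
SELECT amop.amopstrategy AS strategy,
       amop.amopopr::regoperator AS operator
  FROM pg_am am,
       pg_opfamily opf,
       pg_amop amop
 WHERE opf.opfmethod = am.oid AND amop.amopfamily = opf.oid
       AND am.amname = &apos;btree&apos; AND opf.opfname = &apos;text_ops&apos;
       AND amop.amoplefttype = &apos;text&apos;::regtype
       AND amop.amoprighttype = &apos;text&apos;::regtype
 ORDER BY amop.amopstrategy;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;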
&lt;p&gt;Let&apos;s say we have a &lt;code &gt;tsvector&lt;/code&gt; column on our &lt;code &gt;users&lt;/code&gt; table, and we use the &lt;code &gt;@@&lt;/code&gt; operator to search the column:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; users &lt;span &gt;WHERE&lt;/span&gt; about_text_search @@ to_tsquery&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;index&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Even if we create a regular (B-tree) index, Postgres keeps doing a sequential scan:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users&lt;span &gt;(&lt;/span&gt;about_text_search&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;pgaweb=# EXPLAIN SELECT * FROM users WHERE about_text_search @@ to_tsquery(&apos;index&apos;);
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Seq Scan on users  (cost=10000000000.00..10000000006.51 rows=1 width=4463)
   Filter: (about_text_search @@ to_tsquery(&apos;index&apos;::text))
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is because a B-tree index does not have the correct data structure to support text searches. There is no operator class that matches B-tree indexes and the &lt;code &gt;@@(tsvector,tsquery)&lt;/code&gt; operator.&lt;/p&gt;
&lt;p&gt;Like earlier, thanks to Postgres extensibility, we can introspect the system to understand operator classes. &lt;strong&gt;Which index type can support the &lt;code &gt;@@&lt;/code&gt; operator on a tsvector column?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We can query the internal tables to answer this question:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; am&lt;span &gt;.&lt;/span&gt;amname &lt;span &gt;AS&lt;/span&gt; index_method&lt;span &gt;,&lt;/span&gt;
       opf&lt;span &gt;.&lt;/span&gt;opfname &lt;span &gt;AS&lt;/span&gt; opfamily_name&lt;span &gt;,&lt;/span&gt;
       amop&lt;span &gt;.&lt;/span&gt;amopopr::regoperator &lt;span &gt;AS&lt;/span&gt; opfamily_operator
  &lt;span &gt;FROM&lt;/span&gt; pg_am am&lt;span &gt;,&lt;/span&gt;
       pg_opfamily opf&lt;span &gt;,&lt;/span&gt;
       pg_amop amop
 &lt;span &gt;WHERE&lt;/span&gt; opf&lt;span &gt;.&lt;/span&gt;opfmethod &lt;span &gt;=&lt;/span&gt; am&lt;span &gt;.&lt;/span&gt;oid &lt;span &gt;AND&lt;/span&gt; amop&lt;span &gt;.&lt;/span&gt;amopfamily &lt;span &gt;=&lt;/span&gt; opf&lt;span &gt;.&lt;/span&gt;oid
       &lt;span &gt;AND&lt;/span&gt; amop&lt;span &gt;.&lt;/span&gt;amopopr &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;@@(tsvector,tsquery)&apos;&lt;/span&gt;::regoperator&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; index_method | opfamily_name |  opfamily_operator   
--------------+---------------+----------------------
 gist         | tsvector_ops  | @@(tsvector,tsquery)
 gin          | tsvector_ops  | @@(tsvector,tsquery)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Looks like we need either a GIN or GiST index! We can create a GIN index like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;USING&lt;/span&gt; gin &lt;span &gt;(&lt;/span&gt;about_text_search&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And voilà, it can be used by the query:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;=# EXPLAIN SELECT * FROM users WHERE about_text_search @@ to_tsquery(&apos;index&apos;);
                                        QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Bitmap Heap Scan on users  (cost=8.25..12.51 rows=1 width=4463)
   Recheck Cond: (about_text_search @@ to_tsquery(&apos;index&apos;::text))
   -&gt;  Bitmap Index Scan on users_about_text_search_idx1  (cost=0.00..8.25 rows=1 width=0)
         Index Cond: (about_text_search @@ to_tsquery(&apos;index&apos;::text))
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
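&lt;p&gt;Since the catalog query also listed &lt;code &gt;gist&lt;/code&gt;, a GiST index would be the other option. The sketch below shows the equivalent command. For text search, GIN is generally the preferred choice, as a GiST index on a tsvector column is lossy and matches have to be rechecked against the table.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Alternative: a GiST index on the same tsvector column
CREATE INDEX ON users USING gist (about_text_search);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;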
&lt;p&gt;What&apos;s that &lt;code &gt;tsvector_ops&lt;/code&gt; name we saw in the internal Postgres table?&lt;/p&gt;
&lt;p&gt;That&apos;s how index types are linked to operators, using operator families and operator classes. For a given operator, there can be multiple different operator classes - an operator class defines how data is represented for a particular index type, and how the search operation for that index works to implement the operator used in a query.&lt;/p&gt;
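&lt;p&gt;Operator classes can be introspected as well. As a sketch, the following query lists the operator classes available for the &lt;code &gt;tsvector&lt;/code&gt; data type, grouped by index type. The &lt;code &gt;opcdefault&lt;/code&gt; column shows which class is picked when none is specified in &lt;code &gt;CREATE INDEX&lt;/code&gt;.&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Operator classes for tsvector, per index type
SELECT am.amname AS index_method,
       opc.opcname AS opclass_name,
       opc.opcdefault AS is_default
  FROM pg_opclass opc
  JOIN pg_am am ON am.oid = opc.opcmethod
 WHERE opc.opcintype = &apos;tsvector&apos;::regtype
 ORDER BY am.amname, opc.opcname;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;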
&lt;h2 id=&quot;specifying-operator-classes-during-create-index&quot; &gt;&lt;a href=&quot;#specifying-operator-classes-during-create-index&quot; aria-label=&quot;specifying operator classes during create index permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Specifying operator classes during CREATE INDEX&lt;/h2&gt;
&lt;span  &gt;
      &lt;a  href=&quot;https://pganalyze.com/static/b241326f6c407f6efae549183a798b2a/aa440/postgres-operator-class.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;CREATE INDEX command with default operator class&quot; title=&quot;CREATE INDEX command with default operator class&quot; src=&quot;https://pganalyze.com/static/b241326f6c407f6efae549183a798b2a/1d69c/postgres-operator-class.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;p&gt;For example, let’s look at &lt;code &gt;=(text,text)&lt;/code&gt;, which is the operator used in an earlier query:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; am&lt;span &gt;.&lt;/span&gt;amname &lt;span &gt;AS&lt;/span&gt; index_method&lt;span &gt;,&lt;/span&gt;
       opf&lt;span &gt;.&lt;/span&gt;opfname &lt;span &gt;AS&lt;/span&gt; opfamily_name&lt;span &gt;,&lt;/span&gt;
       amop&lt;span &gt;.&lt;/span&gt;amopopr::regoperator &lt;span &gt;AS&lt;/span&gt; opfamily_operator
  &lt;span &gt;FROM&lt;/span&gt; pg_am am&lt;span &gt;,&lt;/span&gt;
       pg_opfamily opf&lt;span &gt;,&lt;/span&gt;
       pg_amop amop
 &lt;span &gt;WHERE&lt;/span&gt; opf&lt;span &gt;.&lt;/span&gt;opfmethod &lt;span &gt;=&lt;/span&gt; am&lt;span &gt;.&lt;/span&gt;oid &lt;span &gt;AND&lt;/span&gt; amop&lt;span &gt;.&lt;/span&gt;amopfamily &lt;span &gt;=&lt;/span&gt; opf&lt;span &gt;.&lt;/span&gt;oid
       &lt;span &gt;AND&lt;/span&gt; amop&lt;span &gt;.&lt;/span&gt;amopopr &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;=(text,text)&apos;&lt;/span&gt;::regoperator&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; index_method |  opfamily_name   | opfamily_operator 
--------------+------------------+-------------------
 btree        | text_ops         | =(text,text)
 hash         | text_ops         | =(text,text)
 btree        | text_pattern_ops | =(text,text)
 hash         | text_pattern_ops | =(text,text)
 spgist       | text_ops         | =(text,text)
 brin         | text_minmax_ops  | =(text,text)
 gist         | gist_text_ops    | =(text,text)
(7 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can see there is a default operator class (&lt;code &gt;text_ops&lt;/code&gt;) that gets used when you don’t explicitly specify it - for text columns the default operator class is often all you need.&lt;/p&gt;
&lt;p&gt;But there are cases where we want to set a particular operator class. For example, let&apos;s say we run a LIKE query on our database, and our database happens to use the en_US.UTF-8 collation - in that case, you will see the LIKE query is not actually able to use an index:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;pgaweb=# EXPLAIN SELECT * FROM users WHERE email LIKE &apos;lukas@%&apos;;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Seq Scan on users  (cost=10000000000.00..10000000001.26 rows=1 width=4463)
   Filter: ((email)::text ~~ &apos;lukas@%&apos;::text)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Generally, LIKE queries are challenging to index, but if you do not have a leading wildcard, an index can be created that works for them - but you need to either (1) use the C locale on your database (effectively saying you don’t want language-specific text sorting/comparison), or (2) use the &lt;code &gt;text_pattern_ops&lt;/code&gt; operator class.&lt;/p&gt;
&lt;p&gt;Let’s create the same index, but this time specify the &lt;code &gt;text_pattern_ops&lt;/code&gt; operator class:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email text_pattern_ops&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;pgaweb=# EXPLAIN SELECT * FROM users WHERE email LIKE &apos;lukas@%&apos;;
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Index Scan using users_email_idx on users  (cost=0.14..8.16 rows=1 width=4463)
   Index Cond: (((email)::text ~&gt;=~ &apos;lukas@&apos;::text) AND ((email)::text ~&amp;lt;~ &apos;lukasA&apos;::text))
   Filter: ((email)::text ~~ &apos;lukas@%&apos;::text)
(3 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, the same &lt;code &gt;LIKE&lt;/code&gt; query can now use the index.&lt;/p&gt;
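&lt;p&gt;One caveat worth knowing, per the Postgres documentation on operator classes: an index created with &lt;code &gt;text_pattern_ops&lt;/code&gt; cannot serve ordinary range comparisons on the column, so you may end up keeping both indexes. Here is a minimal sketch, assuming the same &lt;code &gt;users&lt;/code&gt; table as above (the &lt;code &gt;BETWEEN&lt;/code&gt; query is a made-up example):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Check which collation the current database uses
SELECT datcollate FROM pg_database WHERE datname = current_database();

-- Index for prefix LIKE queries on a non-C collation database
CREATE INDEX ON users (email text_pattern_ops);

-- Index with the default operator class, for ordinary range comparisons
CREATE INDEX ON users (email);

-- Prefix match: can use the text_pattern_ops index
SELECT * FROM users WHERE email LIKE &apos;lukas@%&apos;;

-- Range comparison (illustrative): needs the default operator class index
SELECT * FROM users WHERE email BETWEEN &apos;a&apos; AND &apos;b&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Plain equality comparisons can use either index, so the second index is only worth its write overhead if you actually run range queries on the column.&lt;/p&gt;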
&lt;p&gt;Now that we know our index type and operator class for our columns, let&apos;s look at a few other aspects of creating an index.&lt;/p&gt;
&lt;h2 id=&quot;specifying-multiple-columns-when-adding-a-postgres-index&quot; &gt;&lt;a href=&quot;#specifying-multiple-columns-when-adding-a-postgres-index&quot; aria-label=&quot;specifying multiple columns when adding a postgres index permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Specifying multiple columns when adding a Postgres index&lt;/h2&gt;
&lt;p&gt;One essential feature is the option to add multiple columns to an index definition.&lt;/p&gt;
&lt;p&gt;You can do so like this:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;CREATE INDEX ON [table] ([column_a], [column_b]);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;But what does that actually do? Turns out it’s dependent on the index type. Each index type has a different representation for multiple columns in its data structure. And some index types, such as Hash, do not support multiple columns at all.&lt;/p&gt;
&lt;p&gt;However with the most common index type, B-tree, multi-column indexes work well, and they are commonly used. &lt;strong&gt;The most important thing to know for multi-column B-tree indexes&lt;/strong&gt;: Column order matters. If you have some queries that only utilize &lt;code &gt;column_a&lt;/code&gt;, but all queries utilize &lt;code &gt;column_b&lt;/code&gt;, you should put &lt;code &gt;column_b&lt;/code&gt; first in your index definition. If you don’t follow this rule, you will end up with queries doing a lot more work because they have to skip over all the earlier columns that they can’t filter on. With GIST indexes on the other hand, this does not matter - and you can specify columns in any order.&lt;/p&gt;
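&lt;p&gt;To make the column order rule more concrete, here is a small sketch. The &lt;code &gt;orders&lt;/code&gt; table and its columns are hypothetical and only serve as an illustration. We assume that every query filters on &lt;code &gt;customer_id&lt;/code&gt;, and only some of them filter on &lt;code &gt;status&lt;/code&gt; as well:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical table, for illustration only
CREATE TABLE orders (
  id bigint PRIMARY KEY,
  customer_id bigint NOT NULL,
  status text NOT NULL
);

-- The column that all queries filter on goes first
CREATE INDEX ON orders (customer_id, status);

-- Both of these can use the leading column(s) of the B-tree index
SELECT * FROM orders WHERE customer_id = 42;
SELECT * FROM orders WHERE customer_id = 42 AND status = &apos;shipped&apos;;

-- This one has no condition on the leading column, so the index is
-- typically a poor match for it
SELECT * FROM orders WHERE status = &apos;shipped&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the last query were common in your workload, that would be a reason to reconsider the column order, or to add a second index.&lt;/p&gt;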
&lt;p&gt;Another decision to make is: &lt;strong&gt;Should I create multiple indexes, one for each column I’m querying by, or should I create a single multi-column index?&lt;/strong&gt;&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;table&lt;/span&gt;&lt;span &gt;]&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;column_a&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;table&lt;/span&gt;&lt;span &gt;]&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;column_b&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;--- or&lt;/span&gt;
&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;table&lt;/span&gt;&lt;span &gt;]&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;column_a&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;column_b&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When looking at an individual query, the answer will almost always be: Create a single multi-column index that matches the query. It will be faster than having multiple indexes.&lt;/p&gt;
&lt;p&gt;But if you have a larger workload with many different queries, it may make sense to create multiple single-column indexes instead. Be aware that Postgres will have to do more work in that case, since it may need to combine several indexes in a bitmap scan, and you should verify which indexes actually get chosen by looking at your EXPLAIN plans.&lt;/p&gt;
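&lt;p&gt;As a rough sketch of how you might verify this, assume a hypothetical &lt;code &gt;orders&lt;/code&gt; table with two single-column indexes on &lt;code &gt;customer_id&lt;/code&gt; and &lt;code &gt;status&lt;/code&gt; (these names are made up for illustration). The plan you get depends on your data and statistics, so no output is shown here:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Hypothetical single-column indexes
CREATE INDEX ON orders (customer_id);
CREATE INDEX ON orders (status);

-- Look at the plan: Postgres may pick just one of the indexes and filter
-- the rest, or combine both with a BitmapAnd node
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 AND status = &apos;shipped&apos;;

-- Over time, check how often each index on the table actually gets used
SELECT indexrelname, idx_scan
  FROM pg_stat_user_indexes
 WHERE relname = &apos;orders&apos;
 ORDER BY idx_scan DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;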
&lt;h2 id=&quot;using-functions-and-expressions-in-an-index-definition&quot; &gt;&lt;a href=&quot;#using-functions-and-expressions-in-an-index-definition&quot; aria-label=&quot;using functions and expressions in an index definition permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using functions and expressions in an index definition&lt;/h2&gt;
&lt;p&gt;Stepping back from specific index types for a moment: Postgres has a pretty useful feature that applies to all index types. Instead of indexing a particular column&apos;s value, you can index an expression that references the column&apos;s data.&lt;/p&gt;
&lt;p&gt;For example, we might typically compare our user email addresses with the &lt;code &gt;lower(..)&lt;/code&gt; function:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; users &lt;span &gt;WHERE&lt;/span&gt; lower&lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; $&lt;span &gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you were to run EXPLAIN on this, you would notice that Postgres is not able to use a simple index on &lt;code &gt;email&lt;/code&gt; here - since it doesn’t match the expression.&lt;/p&gt;
&lt;p&gt;But since &lt;code &gt;lower(..)&lt;/code&gt; is what’s called an &quot;immutable&quot; function, we can use it to create an expression index that indexes all values of email in their lower-case form:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;lower&lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now our query will be able to use the index. Note that this does not work for all functions. For example, if you were to create an index on &lt;code &gt;now()&lt;/code&gt;, it would fail:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;&lt;span &gt;now&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;ERROR:  functions in index expression must be marked IMMUTABLE&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Additionally, &lt;strong&gt;remember that expression indexes only work when they match the query.&lt;/strong&gt; If we only have an index on &lt;code &gt;lower(email)&lt;/code&gt;, a query that simply references &lt;code &gt;email&lt;/code&gt; won’t be able to use the index.&lt;/p&gt;
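&lt;p&gt;To illustrate what &quot;matching the query&quot; means, here is a short sketch against the &lt;code &gt;users&lt;/code&gt; table, assuming the index on &lt;code &gt;lower(email)&lt;/code&gt; from above is the only index on that column. The email address is a made-up example value:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Matches the indexed expression: can use the index on lower(email)
SELECT * FROM users WHERE lower(email) = &apos;lukas@example.com&apos;;

-- References the plain column: cannot use the expression index
SELECT * FROM users WHERE email = &apos;lukas@example.com&apos;;

-- Uses a different expression: cannot use the expression index either
SELECT * FROM users WHERE upper(email) = &apos;LUKAS@EXAMPLE.COM&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Running EXPLAIN on each of these is the easiest way to confirm which ones end up using the index on your own data.&lt;/p&gt;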
&lt;h2 id=&quot;specifying-a-where-clause-to-create-partial-postgresql-indexes&quot; &gt;&lt;a href=&quot;#specifying-a-where-clause-to-create-partial-postgresql-indexes&quot; aria-label=&quot;specifying a where clause to create partial postgresql indexes permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Specifying a WHERE clause to create partial PostgreSQL indexes&lt;/h2&gt;
&lt;p&gt;Let’s return to an example we saw at the beginning of the post - but now let’s look at the &lt;code &gt;NullTest&lt;/code&gt; expression:&lt;/p&gt;
&lt;span  &gt;
      &lt;a  src=&quot;https://pganalyze.com/static/ded2511f7ca6069b5cb8495723df4156/aa440/postgres-parse-analysis.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Parse analysis visualization of the query&quot; title=&quot;Parse analysis visualization of the query&quot; src=&quot;https://pganalyze.com/static/ded2511f7ca6069b5cb8495723df4156/1d69c/postgres-parse-analysis.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;p&gt;Here we are making sure we only get rows that are not yet marked as deleted by our application. Depending on your workload, this may be a very large number of rows that needs to be skipped over.&lt;/p&gt;
&lt;p&gt;Whilst you could create an index that includes the &lt;code &gt;deleted_at&lt;/code&gt; column, it would be quite wasteful to have all these index entries that you don’t actually want to ever look at.&lt;/p&gt;
&lt;p&gt;Postgres has a better way: With partial indexes, you can restrict which rows get index entries. When a row does not match the restriction, it won’t be saved to the index, saving space. And during query execution, this also acts as a significant time saver in many cases, since the planner can do a simple check to determine which partial indexes match, and ignore all that don&apos;t match.&lt;/p&gt;
&lt;p&gt;In practice, all you need to do is add a &lt;code &gt;WHERE&lt;/code&gt; clause to your index definition:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users&lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; deleted_at &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There are reasons why you may not want to do that though:&lt;/p&gt;
&lt;p&gt;First, adding this restriction means that only queries that contain &lt;code &gt;deleted_at IS NULL&lt;/code&gt; will be able to use the index. That means you may need two indexes, one with that restriction and the other without.&lt;/p&gt;
&lt;p&gt;Second, adding hundreds or thousands of partial indexes causes overhead in the Postgres planner, as it has to do a more expensive analysis to determine which indexes can be used.&lt;/p&gt;
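&lt;p&gt;The first point can be sketched with a few queries, assuming the partial index on &lt;code &gt;users(email)&lt;/code&gt; from above is the only index on that column (the email address is a placeholder value):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Contains the index predicate: can use the partial index
SELECT * FROM users
 WHERE email = &apos;test@example.com&apos; AND deleted_at IS NULL;

-- No restriction on deleted_at: cannot use the partial index, since
-- deleted rows are missing from it
SELECT * FROM users WHERE email = &apos;test@example.com&apos;;

-- Opposite restriction: cannot use the partial index either
SELECT * FROM users
 WHERE email = &apos;test@example.com&apos; AND deleted_at IS NOT NULL;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;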
&lt;h2 id=&quot;using-include-to-create-a-covering-index-for-index-only-scans&quot; &gt;&lt;a href=&quot;#using-include-to-create-a-covering-index-for-index-only-scans&quot; aria-label=&quot;using include to create a covering index for index only scans permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using INCLUDE to create a covering index for Index-Only Scans&lt;/h2&gt;
&lt;p&gt;Last but not least, let’s talk about a more recent addition to Postgres: The &lt;code &gt;INCLUDE&lt;/code&gt; keyword that can be added to &lt;code &gt;CREATE INDEX&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Before we look at what this keyword does, let’s understand the difference between an Index Scan and an Index-Only Scan. An Index-Only Scan is possible when all data that is needed can be retrieved from the index itself - instead of having to fetch it from the table.&lt;/p&gt;
&lt;p&gt;Note that Index-Only Scans only work well when the table has been recently VACUUMed - otherwise Postgres will need to check row visibility in the table for too many index entries, and therefore does not opt to use Index-Only Scans, preferring an Index Scan instead in most cases.&lt;/p&gt;
&lt;p&gt;Let&apos;s look at two examples - one query that matches an index fully, and one that does not (because of the target list):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;,&lt;/span&gt; id&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;=# EXPLAIN SELECT id FROM users WHERE email = &apos;test@example.com&apos;;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Index Only Scan using users_email_id_idx on users  (cost=0.14..4.16 rows=1 width=4)
   Index Cond: (email = &apos;test@example.com&apos;::text)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;=# EXPLAIN SELECT id, fullname FROM users WHERE email = &apos;test@example.com&apos;;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Index Scan using users_email_id_idx on users  (cost=0.14..8.15 rows=1 width=520)
   Index Cond: ((email)::text = &apos;test@example.com&apos;::text)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, to get an Index Only Scan for the second query we can create an index that includes that column at the end - and that makes Postgres use an Index Only Scan:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;,&lt;/span&gt; id&lt;span &gt;,&lt;/span&gt; fullname&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;=# EXPLAIN SELECT id, fullname FROM users WHERE email = &apos;test@example.com&apos;;
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Index Only Scan using users_email_id_fullname_idx on users  (cost=0.14..4.16 rows=1 width=520)
   Index Cond: (email = &apos;test@example.com&apos;::text)
(2 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;However, doing this has a few downsides: It doesn’t work if you have unique indexes (since any added column would change what’s being checked for uniqueness), and it bloats the data stored in the index for searching.&lt;/p&gt;
&lt;p&gt;For B-tree indexes the new INCLUDE keyword is the better approach:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;,&lt;/span&gt; id&lt;span &gt;)&lt;/span&gt; INCLUDE &lt;span &gt;(&lt;/span&gt;fullname&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This keeps the overhead for such additional columns slightly lower, works without problems with UNIQUE constraint indexes, and clearly communicates the intent: That you only added a column in order to support Index Only Scans.&lt;/p&gt;
&lt;p&gt;This is a feature best used sparingly: Adding more data to the index means larger index values, which on its own can be a problem - it’s usually not a good idea to just add a lot of columns to the INCLUDE clause for an index.&lt;/p&gt;
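&lt;p&gt;To show how this plays together with unique indexes, here is a minimal sketch. It assumes that &lt;code &gt;email&lt;/code&gt; should be unique in the &lt;code &gt;users&lt;/code&gt; table, which is an assumption made for this example only:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Uniqueness is enforced on email only; fullname is stored as extra
-- payload and does not take part in the uniqueness check
CREATE UNIQUE INDEX ON users (email) INCLUDE (fullname);

-- Candidate for an Index Only Scan (on a recently vacuumed table),
-- since both referenced columns are available in the index
EXPLAIN SELECT fullname FROM users WHERE email = &apos;test@example.com&apos;;

-- Included columns are not part of the search key, so the index
-- cannot be searched by fullname
EXPLAIN SELECT email FROM users WHERE fullname = &apos;Test User&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;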
&lt;h2 id=&quot;adding-and-dropping-postgresql-indexes-safely-on-production&quot; &gt;&lt;a href=&quot;#adding-and-dropping-postgresql-indexes-safely-on-production&quot; aria-label=&quot;adding and dropping postgresql indexes safely on production permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Adding and dropping PostgreSQL indexes safely on production&lt;/h2&gt;
&lt;p&gt;I’ll end with a warning: Creating indexes on production databases requires a bit of thought. Not just which index definition to use, but also how to create them, and when to take the I/O impact of the new index being built.&lt;/p&gt;
&lt;p&gt;The most important thing: Remember that Postgres will take a lock on the table when you simply run &lt;code &gt;CREATE INDEX&lt;/code&gt;, which will block all writes to that table (reads are still possible) until the index build completes. That’s why Postgres has the special &lt;code &gt;CONCURRENTLY&lt;/code&gt; keyword. When you create an index on a table on production that already has data, always specify this keyword:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;INDEX&lt;/span&gt; CONCURRENTLY &lt;span &gt;ON&lt;/span&gt; users &lt;span &gt;(&lt;/span&gt;email&lt;span &gt;)&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; deleted_at &lt;span &gt;IS&lt;/span&gt; &lt;span &gt;NULL&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The same applies when dropping an index with &lt;code &gt;DROP INDEX&lt;/code&gt; - adding &lt;code &gt;CONCURRENTLY&lt;/code&gt; reduces the locking requirements, making it safer to use this operation on production.&lt;/p&gt;
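&lt;p&gt;One thing to watch out for with &lt;code &gt;CONCURRENTLY&lt;/code&gt;: if the build fails or is cancelled, Postgres leaves behind an invalid index that still adds write overhead. A rough sketch of how to find and clean up such an index follows. The index name &lt;code &gt;users_email_active_idx&lt;/code&gt; is made up for this example:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Must be run outside of a transaction block
CREATE INDEX CONCURRENTLY users_email_active_idx
    ON users (email) WHERE deleted_at IS NULL;

-- List indexes that are currently marked invalid. This also includes
-- concurrent builds that are still in progress
SELECT indexrelid::regclass AS index_name
  FROM pg_index
 WHERE NOT indisvalid;

-- Drop the leftover index without blocking reads and writes, then retry
DROP INDEX CONCURRENTLY users_email_active_idx;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;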
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this post you should have gotten a fundamental understanding of how operators and operator classes relate to indexing, and why knowing these concepts is essential to creating the best index for complex queries. We also looked at a few complementary features of the &lt;code &gt;CREATE INDEX&lt;/code&gt; command that are typically needed when reasoning about which index to create.&lt;/p&gt;
&lt;p&gt;There are actually a few things we didn’t talk about: Adding indexes to specific tablespaces, using index storage parameters (especially useful for GIN index types!) and specifying the sort order for a particular column. I encourage you to take a further look at the &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-createindex.html&quot;&gt;Postgres documentation for these topics&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Share this article:&lt;/strong&gt; If you liked this article you might want to &lt;a href=&quot;https://twitter.com/intent/tweet?text=%22Postgres%20CREATE%20INDEX:%20Operator%20classes,%20index%20types%20and%20more%22%20-%20This%20post%20by%20%40pganalyze%20offers%20a%20behind%20the%20scenes%20look%20at%20adding%20the%20best%20index%20and%20explains%20how%20to%20match%20index%20definitions%20to%20queries%3A%20https://pganalyze.com/blog/postgres-create-index&quot;&gt;tweet it to your peers&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[A look at Postgres 14: Performance and Monitoring Improvements]]></title><description><![CDATA[The first beta release of the upcoming Postgres 14 release was made available yesterday. In this article we'll take a first look at what's in the beta, with an emphasis on one major performance improvement, as well as three monitoring improvements that caught our attention. Before we get started, I wanted to highlight what always strikes me as an important unique aspect of Postgres: Compared to most other open-source database systems, Postgres is not the project of a single company, but rather…]]></description><link>https://pganalyze.com/blog/postgres-14-performance-monitoring</link><guid isPermaLink="false">https://pganalyze.com/blog/postgres-14-performance-monitoring</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Fri, 21 May 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;The first beta release of the upcoming Postgres 14 release was &lt;a href=&quot;https://www.postgresql.org/about/news/postgresql-14-beta-1-released-2213/&quot;&gt;made available yesterday&lt;/a&gt;. In this article we&apos;ll take a first look at what&apos;s in the beta, with an emphasis on one major performance improvement, as well as three monitoring improvements that caught our attention.&lt;/p&gt;
&lt;p&gt;Before we get started, I wanted to highlight what always strikes me as an important unique aspect of Postgres: Compared to most other open-source database systems, &lt;strong&gt;Postgres is not the project of a single company&lt;/strong&gt;, but rather many individuals coming together to work on a new release, year after year. And that includes everyone who tries out the beta releases, and &lt;a href=&quot;https://www.postgresql.org/developer/beta/&quot;&gt;reports bugs to the Postgres project&lt;/a&gt;. We hope this post inspires you to do your own testing and benchmarking.&lt;/p&gt;
&lt;p&gt;Now, I&apos;m personally most excited about &lt;strong&gt;better connection scaling in Postgres 14&lt;/strong&gt;. For this post we ran a detailed benchmark comparing Postgres 13.3 to 14 beta1 (note that the connection count is log scale):&lt;/p&gt;
&lt;p&gt;
&lt;span  &gt;
      &lt;a  src=&quot;https://pganalyze.com/static/91c9d9c27c4ca41d34699b52f3861794/22252/connection_scaling.png&quot;  target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span  &gt;&lt;/span&gt;
  &lt;img  alt=&quot;Connection Scaling Benchmark Numbers comparing Postgres 13.3 and Postgres 14 beta1&quot; title=&quot;Connection Scaling Benchmark Numbers comparing Postgres 13.3 and Postgres 14 beta1&quot; src=&quot;https://pganalyze.com/static/91c9d9c27c4ca41d34699b52f3861794/1d69c/connection_scaling.png&quot;    loading=&quot;lazy&quot; decoding=&quot;async&quot;&gt;
  &lt;/a&gt;
    &lt;/span&gt;
&lt;/p&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#improved-active-and-idle-connection-scaling-in-postgres-14&quot;&gt;Improved Active and Idle Connection Scaling in Postgres 14&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#dive-into-memory-use-with-pg_backend_memory_contexts&quot;&gt;Dive into memory use with pg_backend_memory_contexts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#track-wal-activity-with-pg_stat_wal&quot;&gt;Track WAL activity with pg_stat_wal&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#monitor-queries-with-the-built-in-postgres-query_id&quot;&gt;Monitor queries with the built-in Postgres query_id&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#and-200-other-improvements-in-the-postgres-14-release&quot;&gt;And 200+ other improvements in the Postgres 14 release!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id=&quot;improved-active-and-idle-connection-scaling-in-postgres-14&quot; &gt;&lt;a href=&quot;#improved-active-and-idle-connection-scaling-in-postgres-14&quot; aria-label=&quot;improved active and idle connection scaling in postgres 14 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Improved Active and Idle Connection Scaling in Postgres 14&lt;/h2&gt;
&lt;p&gt;Postgres 14 brings significant improvements for those of us that need a high number of database connections. The Postgres connection model relies on processes instead of threads. This has some important benefits, but it also has overhead at large connection counts. With this new release, scaling active and idle connections has gotten significantly better, and will be a major improvement for the most demanding applications.&lt;/p&gt;
&lt;p&gt;For our test, we&apos;ve used two 96 vCore AWS instances (c5.24xlarge), one running Postgres 13.3, and one running Postgres 14 beta1. Both of these use Ubuntu 20.04, with the default system settings, but the Postgres connection limit has been increased to 11,000 connections.&lt;/p&gt;
&lt;p&gt;We use &lt;a href=&quot;https://www.postgresql.org/docs/current/pgbench.html&quot;&gt;pgbench&lt;/a&gt; to test connection scaling of active connections. To start, we initialize the database with pgbench scale factor 200:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;# Postgres 13.3
$ pgbench -i -s 200
...
done in 127.71 s (drop tables 0.02 s, create tables 0.02 s, client-side generate 81.74 s, vacuum 2.63 s, primary keys 43.30 s).&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;# Postgres 14 beta1
$ pgbench -i -s 200
...
done in 77.33 s (drop tables 0.02 s, create tables 0.02 s, client-side generate 48.19 s, vacuum 2.70 s, primary keys 26.40 s).&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Already here we can see that Postgres 14 does much better in the initial data load.&lt;/p&gt;
&lt;p&gt;We now launch read-only pgbench with a varying set of active connections, showing 5,000 concurrent connections as an example of a very active workload:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;# Postgres 13.3
$ pgbench -S -c 5000 -j 96 -M prepared -T30
...
tps = 417847.658491 (excluding connections establishing)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;# Postgres 14 beta1
$ pgbench -S -c 5000 -j 96 -M prepared -T30
...
tps = 495108.316805 (without initial connection time)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, the throughput of Postgres 14 at 5000 active connections is about 20% higher. &lt;strong&gt;At 10,000 active connections the improvement is 50% over Postgres 13&lt;/strong&gt;, and at lower connection counts you can also see consistent improvements.&lt;/p&gt;
&lt;p&gt;Note that you will usually see a noticeable TPS drop when the number of connections exceeds the number of CPUs. This is most likely due to CPU scheduling overhead, and not a limitation in Postgres itself. Now, most workloads don&apos;t actually have this many active connections, but rather a high number of idle connections.&lt;/p&gt;
&lt;p&gt;The original author of this work, &lt;a href=&quot;https://twitter.com/andresfreundtec&quot;&gt;Andres Freund&lt;/a&gt;, ran a benchmark on the throughput of a single active query, whilst also running 10,000 idle connections. The query went from 15,000 TPS to almost 35,000 TPS - that&apos;s over 2x better than in Postgres 13. You can find all the details in &lt;strong&gt;&lt;a href=&quot;https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/improving-postgres-connection-scalability-snapshots/ba-p/1806462#fn:1&quot;&gt;Andres Freund&apos;s original post introducing these improvements&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
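&lt;p&gt;If you are unsure whether your own workload is dominated by active or idle connections, a quick look at &lt;code &gt;pg_stat_activity&lt;/code&gt; can help. This is a simple sketch that was not part of the benchmark above:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Count client connections by state (active, idle, idle in transaction, ...)
SELECT state, count(*)
  FROM pg_stat_activity
 WHERE backend_type = &apos;client backend&apos;
 GROUP BY state
 ORDER BY count(*) DESC;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that this is a point-in-time snapshot, so it is worth sampling it a few times during peak load.&lt;/p&gt;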
&lt;h2 id=&quot;dive-into-memory-use-with-pg_backend_memory_contexts&quot; &gt;&lt;a href=&quot;#dive-into-memory-use-with-pg_backend_memory_contexts&quot; aria-label=&quot;dive into memory use with pg_backend_memory_contexts permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dive into memory use with pg_backend_memory_contexts&lt;/h2&gt;
&lt;p&gt;Have you ever been curious why a certain Postgres connection is taking up a higher amount of memory? With the new &lt;code &gt;pg_backend_memory_contexts&lt;/code&gt; view you can take a close look at what exactly is allocated for a given Postgres process.&lt;/p&gt;
&lt;p&gt;To start, we can calculate how much memory is used by our current connection in total:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; pg_size_pretty&lt;span &gt;(&lt;/span&gt;&lt;span &gt;SUM&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;used_bytes&lt;span &gt;)&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_backend_memory_contexts&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; pg_size_pretty 
----------------
 939 kB
(1 row)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, let&apos;s dive a bit deeper. When we query the view for the top 5 entries by memory usage, you will notice there is actually a lot of detailed information:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_backend_memory_contexts &lt;span &gt;ORDER&lt;/span&gt; &lt;span &gt;BY&lt;/span&gt; used_bytes &lt;span &gt;DESC&lt;/span&gt; &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;5&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;          name           | ident |      parent      | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes 
-------------------------+-------+------------------+-------+-------------+---------------+------------+-------------+------------
 CacheMemoryContext      |       | TopMemoryContext |     1 |      524288 |             7 |      64176 |           0 |     460112
 Timezones               |       | TopMemoryContext |     1 |      104120 |             2 |       2616 |           0 |     101504
 TopMemoryContext        |       |                  |     0 |       68704 |             5 |      13952 |          12 |      54752
 WAL record construction |       | TopMemoryContext |     1 |       49768 |             2 |       6360 |           0 |      43408
 MessageContext          |       | TopMemoryContext |     1 |       65536 |             4 |      22824 |           0 |      42712
(5 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A memory context in Postgres is a memory region that is used for allocations to support activities such as query planning or query execution. Once Postgres completes work in a context, the whole context can be freed, simplifying memory handling. Through the use of memory contexts the Postgres source actually avoids doing manual &lt;code &gt;free&lt;/code&gt; calls for the most part (even though it&apos;s written in C), instead relying on memory contexts to clean up memory in groups. The largest memory context in this output, &lt;code &gt;CacheMemoryContext&lt;/code&gt;, is used for many long-lived caches in Postgres.&lt;/p&gt;
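&lt;p&gt;To get a feel for how allocations are grouped, a query along these lines sums usage per context name. This is a minimal sketch that relies only on the columns shown in the output above; the column aliases are our own choice:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Sum allocated and used memory per context name for the current connection
SELECT name,
       COUNT(*) AS contexts,
       pg_size_pretty(SUM(total_bytes)) AS total,
       pg_size_pretty(SUM(used_bytes)) AS used
  FROM pg_backend_memory_contexts
 GROUP BY name
 ORDER BY SUM(used_bytes) DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Grouping by name is useful because some context names can appear many times in a single connection, and the individual rows would otherwise hide their combined size.&lt;/p&gt;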
&lt;p&gt;We can illustrate the impact of loading additional tables into a connection by running a query on a new table, and then querying the view again:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; test3&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_backend_memory_contexts &lt;span &gt;ORDER&lt;/span&gt; &lt;span &gt;BY&lt;/span&gt; used_bytes &lt;span &gt;DESC&lt;/span&gt; &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;5&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;          name           | ident |      parent      | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes 
-------------------------+-------+------------------+-------+-------------+---------------+------------+-------------+------------
 CacheMemoryContext      |       | TopMemoryContext |     1 |      524288 |             7 |      61680 |           1 |     462608
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, simply having queried a table on this connection retains roughly 2.4 kB of additional memory in &lt;code &gt;CacheMemoryContext&lt;/code&gt; (&lt;code &gt;used_bytes&lt;/code&gt; grew from 460112 to 462608, a difference of 2496 bytes), even after the query has finished. This caching of table information is done to speed up future queries, but can sometimes cause surprising amounts of memory usage for multi-tenant databases with many different schemas. You can now illustrate such issues easily through this new monitoring view.&lt;/p&gt;
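&lt;p&gt;If you want to see what sits underneath the cache context, one option is to list its child contexts. This is a sketch under the assumption that some cached objects show up as children of &lt;code &gt;CacheMemoryContext&lt;/code&gt;, with the object name in the &lt;code &gt;ident&lt;/code&gt; column; the exact set of child contexts may differ on your system:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- List the largest contexts whose parent is the long-lived cache context
SELECT name, ident, total_bytes, used_bytes
  FROM pg_backend_memory_contexts
 WHERE parent = &apos;CacheMemoryContext&apos;
 ORDER BY used_bytes DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;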
&lt;p&gt;If you&apos;d like to access this information for processes other than the current one, you can use the new &lt;code &gt;pg_log_backend_memory_contexts&lt;/code&gt; function which will cause the specified process to output its own memory consumption to the Postgres log:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; pg_log_backend_memory_contexts&lt;span &gt;(&lt;/span&gt;&lt;span &gt;10377&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;LOG:  logging memory contexts of PID 10377
STATEMENT:  SELECT pg_log_backend_memory_contexts(pg_backend_pid());
LOG:  level: 0; TopMemoryContext: 80800 total in 6 blocks; 14432 free (5 chunks); 66368 used
LOG:  level: 1; pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used
LOG:  level: 1; TopTransactionContext: 8192 total in 1 blocks; 7720 free (1 chunks); 472 used
LOG:  level: 1; RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used
LOG:  level: 1; MessageContext: 16384 total in 2 blocks; 5152 free (0 chunks); 11232 used
LOG:  level: 1; Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
LOG:  level: 1; smgr relation table: 16384 total in 2 blocks; 4544 free (3 chunks); 11840 used
LOG:  level: 1; TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
...
LOG:  level: 1; ErrorContext: 8192 total in 1 blocks; 7928 free (3 chunks); 264 used
LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560 used&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
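&lt;p&gt;Rather than looking up a process ID by hand, you can combine the function with &lt;code &gt;pg_stat_activity&lt;/code&gt;. The following is a sketch that asks every other client backend to log its memory contexts; we assume here that you are connected as a superuser, and note that this can produce a lot of log output on busy systems:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Request a memory context dump from all client backends except our own
SELECT pid, pg_log_backend_memory_contexts(pid) AS requested
  FROM pg_stat_activity
 WHERE backend_type = &apos;client backend&apos;
   AND pid != pg_backend_pid();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;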
&lt;h2 id=&quot;track-wal-activity-with-pg_stat_wal&quot; &gt;&lt;a href=&quot;#track-wal-activity-with-pg_stat_wal&quot; aria-label=&quot;track wal activity with pg_stat_wal permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Track WAL activity with pg_stat_wal&lt;/h2&gt;
&lt;p&gt;Building on the WAL monitoring capabilities in Postgres 13, the new release brings a new server-wide summary view for WAL information, called &lt;code &gt;pg_stat_wal&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can use this to monitor WAL writes over time more easily:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stat_wal&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;-[ RECORD 1 ]----+------------------------------
wal_records      | 3334645
wal_fpi          | 8480
wal_bytes        | 282414530
wal_buffers_full | 799
wal_write        | 429769
wal_sync         | 428912
wal_write_time   | 0
wal_sync_time    | 0
stats_reset      | 2021-05-21 07:33:22.941452+00&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With this new view we can get summary information such as how many Full Page Images (FPI) were written to the WAL, which can give you insights on when Postgres generated a lot of WAL records due to a checkpoint. Secondly, you can use the new &lt;code &gt;wal_buffers_full&lt;/code&gt; counter to quickly see when the &lt;code &gt;wal_buffers&lt;/code&gt; setting is set too low, which can cause unnecessary I/O that can be prevented by raising wal_buffers to a higher value.&lt;/p&gt;
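&lt;p&gt;As a rough illustration, the counters can be turned into ratios that are easier to compare over time. This sketch uses only columns shown in the output above; the aliases and the choice of ratios are our own:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Share of WAL records that were full page images, plus average record size
SELECT wal_records,
       wal_fpi,
       round(100.0 * wal_fpi / NULLIF(wal_records, 0), 2) AS fpi_pct,
       pg_size_pretty(wal_bytes) AS wal_size,
       round(wal_bytes / NULLIF(wal_records, 0), 1) AS avg_bytes_per_record,
       wal_buffers_full
  FROM pg_stat_wal;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Since all values are cumulative since &lt;code &gt;stats_reset&lt;/code&gt;, comparing two snapshots taken some time apart usually tells you more than a single reading.&lt;/p&gt;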
&lt;p&gt;You can also get more details of the I/O impact of WAL writes by enabling the optional &lt;code &gt;track_wal_io_timing&lt;/code&gt; setting, which then gives you the exact I/O times for WAL writes, and WAL file syncs to disk. Note this setting can have noticeable overhead, so it&apos;s best turned off (the default) unless needed.&lt;/p&gt;
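&lt;p&gt;A minimal sketch of how this could look in practice, assuming superuser access and that the timing columns are reported in milliseconds (the aliases are our own):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Enable WAL I/O timing (adds overhead, off by default), then reload the config
ALTER SYSTEM SET track_wal_io_timing = &apos;on&apos;;
SELECT pg_reload_conf();

-- Later: average time per WAL write and per WAL sync
SELECT wal_write,
       wal_sync,
       round((wal_write_time / NULLIF(wal_write, 0))::numeric, 3) AS avg_write_ms,
       round((wal_sync_time / NULLIF(wal_sync, 0))::numeric, 3) AS avg_sync_ms
  FROM pg_stat_wal;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Keep in mind that the write and sync counters also include activity from before timing was enabled, so the averages are only meaningful once enough new activity has accumulated, or after resetting the statistics.&lt;/p&gt;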
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        title=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        src=&quot;https://pganalyze.com/static/c15d0b3082bebd2680b86cc948555f76/acb04/ebook_promo_query_performance.jpg&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;monitor-queries-with-the-built-in-postgres-query_id&quot; &gt;&lt;a href=&quot;#monitor-queries-with-the-built-in-postgres-query_id&quot; aria-label=&quot;monitor queries with the built in postgres query_id permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Monitor queries with the built-in Postgres query_id&lt;/h2&gt;
&lt;p&gt;In a recent &lt;a href=&quot;https://www.timescale.com/state-of-postgres-results/#top-three&quot;&gt;survey done by TimescaleDB&lt;/a&gt; in March and April 2021, the &lt;code &gt;pg_stat_statements&lt;/code&gt; extension was named one of the top three extensions the surveyed user base uses with Postgres. &lt;code &gt;pg_stat_statements&lt;/code&gt; is bundled with Postgres, and with Postgres 14 one of the important features of the extension got merged into core Postgres:&lt;/p&gt;
&lt;p&gt;The calculation of the &lt;code &gt;query_id&lt;/code&gt;, which uniquely identifies a query, whilst ignoring constant values. Thus, if you run the same query again it will have the same &lt;code &gt;query_id&lt;/code&gt;, enabling you to identify workload patterns on the database. Previously this information was only available with &lt;code &gt;pg_stat_statements&lt;/code&gt;, which shows aggregate statistics about queries that have finished executing, but now this is available with &lt;code &gt;pg_stat_activity&lt;/code&gt; as well as in log files.&lt;/p&gt;
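&lt;p&gt;First we have to enable the new &lt;code &gt;compute_query_id&lt;/code&gt; setting and reload the Postgres configuration afterwards (for example with &lt;code &gt;SELECT pg_reload_conf()&lt;/code&gt;; a full restart should not be needed for this setting):&lt;/p&gt;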
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;ALTER&lt;/span&gt; SYSTEM &lt;span &gt;SET&lt;/span&gt; compute_query_id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;on&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you use &lt;code &gt;pg_stat_statements&lt;/code&gt;, query IDs will be calculated automatically, through the default &lt;code &gt;compute_query_id&lt;/code&gt; setting of &lt;code &gt;auto&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;With query IDs enabled, we can look at &lt;code &gt;pg_stat_activity&lt;/code&gt; during a pgbench run and see why this is helpful as compared to just looking at query text:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; query&lt;span &gt;,&lt;/span&gt; query_id &lt;span &gt;FROM&lt;/span&gt; pg_stat_activity &lt;span &gt;WHERE&lt;/span&gt; backend_type &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;client backend&apos;&lt;/span&gt; &lt;span &gt;LIMIT&lt;/span&gt; &lt;span &gt;5&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;                                 query                                  |      query_id      
------------------------------------------------------------------------+--------------------
 UPDATE pgbench_tellers SET tbalance = tbalance + -4416 WHERE tid = 3;  | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -2979 WHERE tid = 10; | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + 2560 WHERE tid = 6;   | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -65 WHERE tid = 7;    | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -136 WHERE tid = 9;   | 885704527939071629
(5 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;All of these queries are the same from an application perspective, but their text is slightly different, making it hard to find patterns in the workload. With the query ID however we can clearly identify the number of certain kinds of queries, and assess performance problems more easily. For example, we can group by the query ID to see what&apos;s keeping the database busy:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;COUNT&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; state&lt;span &gt;,&lt;/span&gt; query_id &lt;span &gt;FROM&lt;/span&gt; pg_stat_activity &lt;span &gt;WHERE&lt;/span&gt; backend_type &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;client backend&apos;&lt;/span&gt; &lt;span &gt;GROUP&lt;/span&gt; &lt;span &gt;BY&lt;/span&gt; &lt;span &gt;2&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;3&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt; count | state  |       query_id       
-------+--------+----------------------
    40 | active |   885704527939071629
     9 | active |  7660508830961861980
     1 | active | -7810315603562552972
     1 | active | -3907106720789821134
(4 rows)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When you run this on your own system you may find that the query ID is different from the one shown here. This is due to query IDs being dependent on the internal representation of a Postgres query, which can be architecture dependent, and also considers internal IDs of tables instead of their names.&lt;/p&gt;
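&lt;p&gt;Because the same ID is used in both places, currently running queries can be matched to their historic statistics. The following is a sketch that assumes the &lt;code &gt;pg_stat_statements&lt;/code&gt; extension is installed in the current database; the choice of columns is ours:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Match active backends to aggregate statistics via the shared query ID
SELECT a.pid,
       a.state,
       s.calls,
       round(s.mean_exec_time::numeric, 3) AS mean_exec_ms,
       s.query
  FROM pg_stat_activity a
  JOIN pg_stat_statements s
    ON s.queryid = a.query_id
   AND s.dbid = a.datid
   AND s.userid = a.usesysid
 WHERE a.backend_type = &apos;client backend&apos;
   AND a.state = &apos;active&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;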
&lt;p&gt;The query ID information is also available in &lt;code &gt;log_line_prefix&lt;/code&gt; through the new &lt;code &gt;%Q&lt;/code&gt; option, making it easier to get &lt;code &gt;auto_explain&lt;/code&gt; output that&apos;s linked to a query:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;2021-05-21 08:18:02.949 UTC [7176] [user=postgres,db=postgres,app=pgbench,query=885704527939071629] LOG:  duration: 59.827 ms  plan:
	Query Text: UPDATE pgbench_tellers SET tbalance = tbalance + -1902 WHERE tid = 6;
	Update on pgbench_tellers  (cost=4.14..8.16 rows=0 width=0) (actual time=59.825..59.826 rows=0 loops=1)
	  -&gt;  Bitmap Heap Scan on pgbench_tellers  (cost=4.14..8.16 rows=1 width=10) (actual time=0.009..0.011 rows=1 loops=1)
	        Recheck Cond: (tid = 6)
	        Heap Blocks: exact=1
	        -&gt;  Bitmap Index Scan on pgbench_tellers_pkey  (cost=0.00..4.14 rows=1 width=0) (actual time=0.003..0.004 rows=1 loops=1)
	              Index Cond: (tid = 6)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
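&lt;p&gt;A prefix similar to the one in the log line above could be configured as follows. This is a sketch: the exact prefix used for that output is not shown here, so the format string below is our assumption, built from the standard escapes for timestamp (&lt;code &gt;%m&lt;/code&gt;), process ID (&lt;code &gt;%p&lt;/code&gt;), user (&lt;code &gt;%u&lt;/code&gt;), database (&lt;code &gt;%d&lt;/code&gt;) and application name (&lt;code &gt;%a&lt;/code&gt;), plus the new &lt;code &gt;%Q&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Include the query ID in every log line, then reload the configuration
ALTER SYSTEM SET log_line_prefix = &apos;%m [%p] [user=%u,db=%d,app=%a,query=%Q] &apos;;
SELECT pg_reload_conf();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;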
&lt;p&gt;&lt;strong&gt;Want to link &lt;code &gt;auto_explain&lt;/code&gt; and &lt;code &gt;pg_stat_statements&lt;/code&gt;, and can&apos;t wait for Postgres 14?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We built our &lt;a href=&quot;https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser#fingerprints-in-pg_query-a-better-way-to-check-if-two-queries-are-identical&quot;&gt;own open-source query fingerprint mechanism&lt;/a&gt; that uniquely identifies queries based on their text. This is used in pganalyze for matching EXPLAIN plans to queries, and you can also use this in your own scripts, with any Postgres version.&lt;/p&gt;
&lt;h2 id=&quot;and-200-other-improvements-in-the-postgres-14-release&quot; &gt;&lt;a href=&quot;#and-200-other-improvements-in-the-postgres-14-release&quot; aria-label=&quot;and 200 other improvements in the postgres 14 release permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;And 200+ other improvements in the Postgres 14 release!&lt;/h2&gt;
&lt;p&gt;These are just some of the many improvements in the new Postgres release. You can find more on what&apos;s new in the &lt;a href=&quot;https://www.postgresql.org/docs/14/release-14.html&quot;&gt;release notes&lt;/a&gt;, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The new predefined roles &lt;code &gt;pg_read_all_data&lt;/code&gt;/&lt;code &gt;pg_write_all_data&lt;/code&gt; give global read or write access&lt;/li&gt;
&lt;li&gt;Automatic cancellation of long-running queries if the client disconnects&lt;/li&gt;
&lt;li&gt;Vacuum now skips index vacuuming when the number of removable index entries is insignificant&lt;/li&gt;
&lt;li&gt;Per-index information is now included in autovacuum logging output&lt;/li&gt;
&lt;li&gt;Partitions can now be detached in a non-blocking manner with &lt;code &gt;ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And many more. &lt;strong&gt;Now is the time to help test!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Download beta1 from the &lt;a href=&quot;https://www.postgresql.org/download/&quot;&gt;official package repositories&lt;/a&gt;, or build it from source. We can all contribute to making Postgres 14 a stable release in a few months from now.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;At pganalyze, we&apos;re excited about Postgres 14, and hope this post got you interested as well! Postgres shows again how many small improvements make it a stable, trustworthy database that is built by the community, for the community.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/intent/tweet?text=%22An%20early%20look%20at%20%23Postgres14%20Performance%20and%20Monitoring%20Improvements%22%20-%20Here,%20%40pganalyze%20looks%20at%20idle%20and%20active%20connection%20scaling,%20memory%20monitoring,%20query%20IDs,%20and%20more%3A%20https%3A%2F%2Fpganalyze.com%2Fblog%2Fpostgres-14-performance-monitoring&quot;&gt;Share this post on Twitter&lt;/a&gt;&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Introducing pg_query 2.0: The easiest way to parse Postgres queries]]></title><description><![CDATA[The query parser is a core component of Postgres: the database needs to understand what data you're asking for in order to return the right results. But this functionality is also useful for all sorts of other tools that work with Postgres queries. A few years ago, we released pg_query to support this functionality in a standalone C library. pganalyze uses pg_query to parse and analyze every SQL query that runs on your Postgres database. Our initial motivation was to create pg_query for checking…]]></description><link>https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser</link><guid isPermaLink="false">https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 18 Mar 2021 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/75f87b67b2112c7051bbb7a763b21a0b/aa440/query_parsing_intro_image.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Parsing of &amp;quot;SELECT * FROM mytable&amp;quot; SQL statement into the associated Postgres parse tree&quot;
        title=&quot;Parsing of &amp;quot;SELECT * FROM mytable&amp;quot; SQL statement into the associated Postgres parse tree&quot;
        src=&quot;https://pganalyze.com/static/75f87b67b2112c7051bbb7a763b21a0b/1d69c/query_parsing_intro_image.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The query parser is a core component of Postgres: the database needs to understand what data you&apos;re asking for in order to return the right results. But this functionality is also useful for all sorts of other tools that work with Postgres queries. A few years ago, &lt;a href=&quot;https://pganalyze.com/blog/parse-postgresql-queries-in-ruby&quot;&gt;we released pg_query&lt;/a&gt; to support this functionality in a standalone C library.&lt;/p&gt;
&lt;p&gt;pganalyze uses pg_query to &lt;strong&gt;parse and analyze every SQL query that runs on your Postgres database&lt;/strong&gt;. Our initial motivation was to create pg_query for checking which tables a query references, or what kind of statement it is. Since then we&apos;ve expanded its use in pganalyze itself. pganalyze now truncates query text in a smart manner in the query overview. The &lt;a href=&quot;https://github.com/pganalyze/collector&quot;&gt;pganalyze-collector&lt;/a&gt; supports collecting EXPLAIN plans, and uses pg_query to support log-based EXPLAIN. And we link together &lt;code &gt;pg_stat_statements&lt;/code&gt; and &lt;code &gt;auto_explain&lt;/code&gt; data in pganalyze using query fingerprints (another pg_query feature we&apos;ll discuss in detail in a later section).&lt;/p&gt;
&lt;h2 id=&quot;postgres-community-tools-build-on-pg_query&quot; &gt;&lt;a href=&quot;#postgres-community-tools-build-on-pg_query&quot; aria-label=&quot;postgres community tools build on pg_query permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Postgres community tools build on pg_query&lt;/h2&gt;
&lt;p&gt;What we didn&apos;t expect at the time was the tremendous interest we&apos;ve seen from the community. &lt;strong&gt;The Ruby library alone has received over 3.5 million downloads in its lifetime.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Thanks to many contributors, pg_query now has bindings for other languages beyond Ruby and Go, such as Python (&lt;a href=&quot;https://pypi.org/project/pglast/&quot;&gt;pglast&lt;/a&gt;, maintained by &lt;a href=&quot;https://github.com/lelit&quot;&gt;Lele Gaifax&lt;/a&gt;), Node.js (&lt;a href=&quot;https://www.npmjs.com/package/pgsql-parser&quot;&gt;pgsql-parser&lt;/a&gt;, maintained by &lt;a href=&quot;https://github.com/pyramation&quot;&gt;Dan Lynch&lt;/a&gt;) and even OCaml. There are also many notable third-party projects that use pg_query to parse Postgres queries. Here are some of our favorites:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://sqlc.dev/&quot;&gt;sqlc&lt;/a&gt; provides type-safe, SQL-based database access in Go&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://korban.net/posts/postgres/2017-09-18-debugging-complex-postgres-queries-with-pgdebug/&quot;&gt;pgdebug&lt;/a&gt; lets you debug complex CTEs and execute parts as a standalone query&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/cloudspannerecosystem/harbourbridge&quot;&gt;Google&apos;s HarbourBridge&lt;/a&gt; uses pg_query for helping customers trial Spanner from Postgres sources&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/cwida/duckdb/tree/master/third_party/libpg_query&quot;&gt;DuckDB&lt;/a&gt; uses a forked version of pg_query for their parsing layer&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gitlab.com/gitlab-org/gitlab/-/blob/a36e2684/Gemfile#L310&quot;&gt;GitLab&lt;/a&gt; uses pg_query for normalizing queries in their internal error reporting&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/splitgraph/splitgraph/blob/b1784d3a2009c3ee3a027372c46dcc730ce2ca78/splitgraph/core/sql/__init__.py&quot;&gt;Splitgraph&lt;/a&gt; uses pg_query via the pglast Python binding to parse the SQL statements in Splitfiles&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/purcell/sqlint&quot;&gt;sqlint&lt;/a&gt; lints your SQL files for correctness&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Today, it&apos;s time to bring pg_query to the next level.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id=&quot;announcing-pg_query-20-better--faster-parsing-with-postgres-13-support&quot; &gt;&lt;a href=&quot;#announcing-pg_query-20-better--faster-parsing-with-postgres-13-support&quot; aria-label=&quot;announcing pg_query 20 better  faster parsing with postgres 13 support permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Announcing pg_query 2.0: Better &amp;#x26; faster parsing, with Postgres 13 support&lt;/h2&gt;
&lt;p&gt;We&apos;re excited to announce the next major version of pg_query, &lt;strong&gt;pg_query 2.0.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this version, you&apos;ll find support for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Parsing the PostgreSQL 13 query syntax&lt;/li&gt;
&lt;li&gt;Deparser as part of the core C library, to turn modified parse trees back into SQL&lt;/li&gt;
&lt;li&gt;New parse tree format based on &lt;a href=&quot;https://developers.google.com/protocol-buffers&quot;&gt;Protocol Buffers&lt;/a&gt; (Protobuf)&lt;/li&gt;
&lt;li&gt;Improved, faster query fingerprinting mechanism&lt;/li&gt;
&lt;li&gt;And much more!&lt;/li&gt;
&lt;/ul&gt;
&lt;div &gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#postgres-community-tools-build-on-pg_query&quot;&gt;Postgres community tools build on pg_query&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#announcing-pg_query-20-better--faster-parsing-with-postgres-13-support&quot;&gt;Announcing pg_query 2.0: Better &amp;#x26; faster parsing, with Postgres 13 support&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#how-pg_query-turns-a-postgres-statement-into-a-parse-tree&quot;&gt;How pg_query turns a Postgres statement into a parse tree&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#using-libclang-to-extract-c-source-code-from-postgres&quot;&gt;Using LibClang to extract C source code from Postgres&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#turning-postgres-parser-c-structs-into-json-and-protobufs&quot;&gt;Turning Postgres parser C structs into JSON and Protobufs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#why-pg_query-20-adds-support-for-protocol-buffers&quot;&gt;Why pg_query 2.0 adds support for Protocol Buffers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#turning-parse-trees-back-into-sql-using-a-deparser&quot;&gt;Turning parse trees back into SQL using a deparser&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#the-pg_query-deparser-with-coverage-for-all-postgres-regression-tests&quot;&gt;The pg_query deparser with coverage for all Postgres regression tests&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#fingerprints-in-pg_query-a-better-way-to-check-if-two-queries-are-identical&quot;&gt;Fingerprints in pg_query: A better way to check if two queries are identical&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#why-did-we-create-our-own-query-fingerprint-concept&quot;&gt;Why did we create our own query fingerprint concept?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#additional-changes-for-pg_query-20&quot;&gt;Additional changes for pg_query 2.0&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;To start, let&apos;s revisit how pg_query actually works.&lt;/p&gt;
&lt;h2 id=&quot;how-pg_query-turns-a-postgres-statement-into-a-parse-tree&quot; &gt;&lt;a href=&quot;#how-pg_query-turns-a-postgres-statement-into-a-parse-tree&quot; aria-label=&quot;how pg_query turns a postgres statement into a parse tree permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How pg_query turns a Postgres statement into a parse tree&lt;/h2&gt;
&lt;p&gt;There are many ways to parse SQL, but the scope for pg_query is very specific: to be able to parse the full Postgres query syntax, the same way as Postgres does. The only reliable way to do this is to &lt;strong&gt;use the Postgres parser itself&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;pg_query isn&apos;t the first project to do this; for example, pgpool has a copy of the Postgres parser as well. But we needed an easily maintainable, self-contained version of the parser in a standalone C library. This would let us, and the Postgres community, use the parser from almost any language by writing a simple wrapper.&lt;/p&gt;
&lt;p&gt;How did we do this? We started by &lt;a href=&quot;https://github.com/postgres/postgres/blob/REL_13_STABLE/src/backend/parser/parser.c#L42&quot;&gt;looking at the Postgres source&lt;/a&gt;, where you will find a function called &lt;code &gt;raw_parser&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * raw_parser
 *		Given a query in string form, do lexical and grammatical analysis.
 *
 * Returns a list of raw (un-analyzed) parse trees.  The immediate elements
 * of the list are always RawStmt nodes.
 */&lt;/span&gt;
List &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;raw_parser&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;const&lt;/span&gt; &lt;span &gt;char&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;str&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After raw parsing, Postgres goes into parse analysis. In that phase Postgres identifies the types of columns, maps table names to the schema and more. After that, Postgres does planning (see our introduction to &lt;a href=&quot;https://pganalyze.com/docs/explain/basics-of-postgres-query-planning&quot;&gt;Postgres query planning&lt;/a&gt;), and then executes the query based on the query plan.&lt;/p&gt;
&lt;p&gt;For pg_query, all we need is the raw parser. Looking at the code, &lt;strong&gt;we discovered a problem&lt;/strong&gt;. The parser code still depends on a lot of Postgres code, such as for memory management or error handling. We needed a repeatable way to extract just enough source code to compile and run the parser.&lt;/p&gt;
&lt;p&gt;Thus the idea was born to automatically extract the Postgres parser code and its dependencies.&lt;/p&gt;
&lt;h3 id=&quot;using-libclang-to-extract-c-source-code-from-postgres&quot; &gt;&lt;a href=&quot;#using-libclang-to-extract-c-source-code-from-postgres&quot; aria-label=&quot;using libclang to extract c source code from postgres permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using LibClang to extract C source code from Postgres&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Our goal:&lt;/strong&gt; A set of self-contained C files that represent a copy of Postgres&apos; &lt;code &gt;raw_parser&lt;/code&gt; function. But we don&apos;t want to copy the code manually. Luckily we can use &lt;a href=&quot;https://clang.llvm.org/docs/Tooling.html#libclang&quot;&gt;LibClang&lt;/a&gt; to parse C code, and understand its dependencies.&lt;/p&gt;
&lt;p&gt;The details of this could fill many pages, but here is a simplified version of how this works:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Each translation unit (.c file) in the source is analyzed via LibClang&apos;s Ruby binding:&lt;/strong&gt;&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;require&lt;/span&gt; &lt;span &gt;&apos;ffi/clang&apos;&lt;/span&gt;

index &lt;span &gt;=&lt;/span&gt; &lt;span &gt;FFI&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;Clang&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;:&lt;/span&gt;&lt;span &gt;Index&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;new&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;true&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;true&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
translation_unit &lt;span &gt;=&lt;/span&gt; index&lt;span &gt;.&lt;/span&gt;parse_translation_unit&lt;span &gt;(&lt;/span&gt;file&lt;span &gt;,&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;... CFLAGS ...&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;2. The analysis walks through the file and records each C function, as well as the symbols it references:&lt;/strong&gt;&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;translation_unit&lt;span &gt;.&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;visit_children &lt;span &gt;do&lt;/span&gt; &lt;span &gt;|&lt;/span&gt;cursor&lt;span &gt;,&lt;/span&gt; parent&lt;span &gt;|&lt;/span&gt;
  &lt;span &gt;@file_to_symbol_positions&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;location&lt;span &gt;.&lt;/span&gt;file&lt;span &gt;]&lt;/span&gt; &lt;span &gt;||&lt;/span&gt;&lt;span &gt;=&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;
  &lt;span &gt;@file_to_symbol_positions&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;location&lt;span &gt;.&lt;/span&gt;file&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;spelling&lt;span &gt;]&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;extent&lt;span &gt;.&lt;/span&gt;start&lt;span &gt;.&lt;/span&gt;offset&lt;span &gt;,&lt;/span&gt; cursor&lt;span &gt;.&lt;/span&gt;extent&lt;span &gt;.&lt;/span&gt;&lt;span &gt;end&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;offset&lt;span &gt;]&lt;/span&gt;
  cursor&lt;span &gt;.&lt;/span&gt;visit_children &lt;span &gt;do&lt;/span&gt; &lt;span &gt;|&lt;/span&gt;child_cursor&lt;span &gt;,&lt;/span&gt; parent&lt;span &gt;|&lt;/span&gt;
    &lt;span &gt;if&lt;/span&gt; child_cursor&lt;span &gt;.&lt;/span&gt;kind &lt;span &gt;==&lt;/span&gt; &lt;span &gt;:cursor_decl_ref_expr&lt;/span&gt; &lt;span &gt;||&lt;/span&gt; child_cursor&lt;span &gt;.&lt;/span&gt;kind &lt;span &gt;==&lt;/span&gt; &lt;span &gt;:cursor_call_expr&lt;/span&gt;
      &lt;span &gt;@references&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;spelling&lt;span &gt;]&lt;/span&gt; &lt;span &gt;||&lt;/span&gt;&lt;span &gt;=&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;
      &lt;span &gt;(&lt;/span&gt;&lt;span &gt;@references&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;cursor&lt;span &gt;.&lt;/span&gt;spelling&lt;span &gt;]&lt;/span&gt; &lt;span &gt;&amp;lt;&lt;/span&gt;&lt;span &gt;&amp;lt;&lt;/span&gt; child_cursor&lt;span &gt;.&lt;/span&gt;spelling&lt;span &gt;)&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;uniq&lt;span &gt;!&lt;/span&gt;
    &lt;span &gt;end&lt;/span&gt;
    &lt;span &gt;:recurse&lt;/span&gt;
  &lt;span &gt;end&lt;/span&gt;
&lt;span &gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;3. We resolve the required C functions and their code, starting from the top-level function we are looking for:&lt;/strong&gt;&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;def&lt;/span&gt; &lt;span &gt;&lt;span &gt;deep_resolve&lt;/span&gt;&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;method_name&lt;span &gt;,&lt;/span&gt; depth&lt;span &gt;:&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; trail&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; global_resolved_by_parent&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; static_resolved_by_parent&lt;span &gt;:&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; static_base_filename&lt;span &gt;:&lt;/span&gt; &lt;span &gt;nil&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
  global_dependents &lt;span &gt;=&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;@references&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;method_name&lt;span &gt;]&lt;/span&gt; &lt;span &gt;||&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
  global_dependents&lt;span &gt;.&lt;/span&gt;&lt;span &gt;each&lt;/span&gt; &lt;span &gt;do&lt;/span&gt; &lt;span &gt;|&lt;/span&gt;symbol&lt;span &gt;|&lt;/span&gt;
    deep_resolve&lt;span &gt;(&lt;/span&gt;symbol&lt;span &gt;,&lt;/span&gt; depth&lt;span &gt;:&lt;/span&gt; depth &lt;span &gt;+&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; trail&lt;span &gt;:&lt;/span&gt; trail &lt;span &gt;+&lt;/span&gt; &lt;span &gt;[&lt;/span&gt;method_name&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; global_resolved_by_parent&lt;span &gt;:&lt;/span&gt; global_resolved_by_parent &lt;span &gt;+&lt;/span&gt; global_dependents&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;end&lt;/span&gt;
  &lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
&lt;span &gt;end&lt;/span&gt;

deep_resolve&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;raw_parser&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;4. We write out just the portions of the C code that are required (see &lt;a href=&quot;https://github.com/pganalyze/libpg_query/blob/6517eedf6c3c6c53a14ecd8f01410bb8fc3c8ec1/scripts/extract_source.rb#L354&quot;&gt;details here&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With this, we have a working Postgres parser!&lt;/p&gt;
&lt;p&gt;You can find the full details in the &lt;a href=&quot;https://github.com/pganalyze/libpg_query/blob/13-latest/scripts/extract_source.rb&quot;&gt;pg_query source&lt;/a&gt;.&lt;/p&gt;
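&lt;p&gt;Steps 2 to 4 can be sketched without LibClang. The following is a minimal, self-contained illustration and not the actual extraction script: the &lt;code &gt;SOURCE&lt;/code&gt; string, the &lt;code &gt;REFERENCES&lt;/code&gt; table and the helper names are made-up stand-ins for what the LibClang analysis would collect.&lt;/p&gt;

```ruby
require 'set'

# Hypothetical stand-in for a C source file (not real Postgres code).
SOURCE = "int helper(void) { return 1; }\n" \
         "int unused(void) { return 2; }\n" \
         "int entry(void) { return helper(); }\n"

# Hypothetical symbol table: symbol name => symbols it references.
REFERENCES = {
  'entry'  => ['helper'],
  'helper' => [],
  'unused' => []
}

# Walk the reference graph from a top-level symbol and collect everything
# it needs. The Set guards against revisiting symbols and against cycles.
def resolve(symbol, resolved = Set.new)
  return resolved if resolved.include?(symbol)
  resolved.add(symbol)
  (REFERENCES[symbol] || []).each { |dep| resolve(dep, resolved) }
  resolved
end

# Start and end offsets of each definition, found here by searching the
# text. In the real pipeline these offsets come from the cursor extents.
def positions
  REFERENCES.keys.to_h do |name|
    start = SOURCE.index("int #{name}(")
    finish = SOURCE.index("\n", start)
    [name, [start, finish]]
  end
end

# Write out only the required definitions, in source order.
def extract(entry_point)
  needed = resolve(entry_point)
  ranges = positions.select { |name, _| needed.include?(name) }.values
  ranges.sort.map { |start, finish| SOURCE[start...finish] }.join("\n")
end

puts extract('entry')
```

&lt;p&gt;The output contains the &lt;code &gt;helper&lt;/code&gt; and &lt;code &gt;entry&lt;/code&gt; definitions but not &lt;code &gt;unused&lt;/code&gt;, which is the property we want for the extracted parser: only code reachable from the entry point is copied.&lt;/p&gt;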
&lt;p&gt;Once we can call the Postgres parser in our standalone library, we can get the result as a parse tree, represented as Postgres parser C structs. But now we needed to make this useful in other languages, such as Ruby or Go.&lt;/p&gt;
&lt;h3 id=&quot;turning-postgres-parser-c-structs-into-json-and-protobufs&quot; &gt;&lt;a href=&quot;#turning-postgres-parser-c-structs-into-json-and-protobufs&quot; aria-label=&quot;turning postgres parser c structs into json and protobufs permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Turning Postgres parser C structs into JSON and Protobufs&lt;/h3&gt;
&lt;p&gt;It&apos;s a little known fact, but Postgres actually has a text representation of a query parse tree. It&apos;s rarely used directly, being reserved for internal communication and debugging. The easiest way to see an example is by looking at the &lt;code &gt;adbin&lt;/code&gt; field in &lt;a href=&quot;https://www.postgresql.org/docs/13/catalog-pg-attrdef.html&quot;&gt;pg_attrdef&lt;/a&gt;, which shows the internal representation of the expression for a column default value (in contrast, &lt;code &gt;pg_get_expr&lt;/code&gt; shows the expression in SQL):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; adbin&lt;span &gt;,&lt;/span&gt; pg_get_expr&lt;span &gt;(&lt;/span&gt;adbin&lt;span &gt;,&lt;/span&gt; adrelid&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_attrdef &lt;span &gt;WHERE&lt;/span&gt; adrelid &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;mytable&apos;&lt;/span&gt;::regclass &lt;span &gt;AND&lt;/span&gt; adnum &lt;span &gt;=&lt;/span&gt; &lt;span &gt;1&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
adbin       | {FUNCEXPR :funcid 480 :funcresulttype 23 :funcretset false :funcvariadic false :funcformat 2 :funccollid 0 :inputcollid 0 :args ({FUNCEXPR :funcid 1574 :funcresulttype 20 :funcretset false :funcvariadic false :funcformat 0 :funccollid 0 :inputcollid 0 :args ({CONST :consttype 2205 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location 68 :constvalue 4 [ -27 10 -122 1 0 0 0 0 ]}) :location 60}) :location -1}
pg_get_expr | nextval(&apos;mytable_id_seq&apos;::regclass)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This text format is not useful for working with a parse tree in other languages. Thus, we needed a more portable format to export the parse tree from C, and import it in another language such as Ruby.&lt;/p&gt;
&lt;p&gt;The initial version of pg_query used JSON for this. JSON is great, since you can parse it in pretty much any programming language. Thus, in this new pg_query release, we still support JSON.&lt;/p&gt;
&lt;p&gt;We&apos;re also introducing support for a new schema-based format, using Protocol Buffers (Protobuf).&lt;/p&gt;
&lt;h3 id=&quot;why-pg_query-20-adds-support-for-protocol-buffers&quot; &gt;&lt;a href=&quot;#why-pg_query-20-adds-support-for-protocol-buffers&quot; aria-label=&quot;why pg_query 20 adds support for protocol buffers permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why pg_query 2.0 adds support for Protocol Buffers&lt;/h3&gt;
&lt;p&gt;Whilst JSON is convenient for passing around the parse tree, it has a few problems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;JSON is slower to parse than a binary format&lt;/li&gt;
&lt;li&gt;Memory usage can become an issue with complex parse trees&lt;/li&gt;
&lt;li&gt;Building logic around a tree of JSON data is error-prone, as one needs to add a lot of checks to identify each node and its supported fields&lt;/li&gt;
&lt;li&gt;It&apos;s hard to instantiate new parse tree nodes, for example to use for deparsing back into a SQL statement&lt;/li&gt;
&lt;/ul&gt;
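&lt;p&gt;To make the third point concrete, here is a rough sketch of collecting nodes from a JSON tree using only Ruby&apos;s standard library. The tree follows the shape of the pg_query 1.0 output for &lt;code &gt;SELECT 1&lt;/code&gt;; the &lt;code &gt;find_nodes&lt;/code&gt; helper is hypothetical and not part of pg_query.&lt;/p&gt;

```ruby
require 'json'

# JSON for "SELECT 1", in the pg_query 1.0 shape (node type as the hash key).
TREE_JSON = '[{"RawStmt": {"stmt": {"SelectStmt": {"targetList": ' \
            '[{"ResTarget": {"val": {"A_Const": {"val": {"Integer": ' \
            '{"ival": 1}}, "location": 7}}}}]}}}}]'

# Recursively collect every node of the given type. Each step has to check
# whether it is looking at a Hash, an Array, or a plain value, because the
# JSON data carries no schema.
def find_nodes(tree, type, found = [])
  case tree
  when Hash
    tree.each do |key, value|
      found.push(value) if key == type
      find_nodes(value, type, found)
    end
  when Array
    tree.each { |item| find_nodes(item, type, found) }
  end
  found
end

tree = JSON.parse(TREE_JSON)
constants = find_nodes(tree, 'A_Const')
puts constants.inspect

# A typo in a node name is not an error, it just silently finds nothing.
puts find_nodes(tree, 'A_Cosnt').inspect
```

&lt;p&gt;Nothing in this code knows which fields an &lt;code &gt;A_Const&lt;/code&gt; node is supposed to have, so mistakes only surface at runtime, if at all.&lt;/p&gt;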
&lt;p&gt;In pg_query 1.0, accessing the value of a &quot;SELECT 1&quot; would have looked like this with the Ruby binding:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;result &lt;span &gt;=&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT 1&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
result&lt;span &gt;.&lt;/span&gt;tree&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;RawStmt&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;stmt&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;SelectStmt&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;targetList&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;ResTarget&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&apos;val&apos;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;
&lt;span &gt;# =&gt; {&quot;A_Const&quot;=&gt;{&quot;val&quot;=&gt;{&quot;Integer&quot;=&gt;{&quot;ival&quot;=&gt;1}}, &quot;location&quot;=&gt;7}}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here is how Protobuf improves the parse tree handling in Ruby:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;result &lt;span &gt;=&lt;/span&gt; &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT 1&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
result&lt;span &gt;.&lt;/span&gt;tree&lt;span &gt;.&lt;/span&gt;stmts&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;stmt&lt;span &gt;.&lt;/span&gt;select_stmt&lt;span &gt;.&lt;/span&gt;target_list&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;res_target&lt;span &gt;.&lt;/span&gt;val&lt;span &gt;.&lt;/span&gt;a_const
&lt;span &gt;# =&gt; &amp;lt;PgQuery::A_Const: val: &amp;lt;PgQuery::Node: integer: &amp;lt;PgQuery::Integer: ival: 1&gt;&gt;, location: 7&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note how we have a full class definition for each parse tree node type, making interaction with the tree nodes significantly easier.&lt;/p&gt;
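&lt;p&gt;The benefit of typed nodes can be imitated with plain Ruby Structs. This is only an analogy: the &lt;code &gt;IntegerNode&lt;/code&gt;, &lt;code &gt;AConst&lt;/code&gt; and &lt;code &gt;ResTarget&lt;/code&gt; classes below are hypothetical and far simpler than the classes generated from the Protobuf definition.&lt;/p&gt;

```ruby
# Hypothetical, simplified node classes. The real classes are generated
# from the Protobuf schema and have many more fields.
IntegerNode = Struct.new(:ival, keyword_init: true)
AConst = Struct.new(:val, :location, keyword_init: true)
ResTarget = Struct.new(:name, :val, keyword_init: true)

# Instantiating a new node is a plain constructor call.
target = ResTarget.new(val: AConst.new(val: IntegerNode.new(ival: 1), location: 7))

# Field access is a method call rather than a chain of string keys.
puts target.val.val.ival

# A misspelled field fails loudly instead of returning nil.
begin
  target.val.locaton
rescue NoMethodError => e
  puts "caught: #{e.class}"
end

# So does an unknown field at construction time.
begin
  AConst.new(value: 1)
rescue ArgumentError => e
  puts "caught: #{e.class}"
end
```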
&lt;p&gt;Now, let&apos;s say I want to change a parse tree and turn it back into a SQL statement. For this, I need a deparser.&lt;/p&gt;
&lt;h2 id=&quot;turning-parse-trees-back-into-sql-using-a-deparser&quot; &gt;&lt;a href=&quot;#turning-parse-trees-back-into-sql-using-a-deparser&quot; aria-label=&quot;turning parse trees back into sql using a deparser permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Turning parse trees back into SQL using a deparser&lt;/h2&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/ecd4126fc167bfe7a01adca9808529e3/aa440/deparser.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Illustration of deparsing Postgres parse tree back into SELECT statement&quot;
        title=&quot;Illustration of deparsing Postgres parse tree back into SELECT statement&quot;
        src=&quot;https://pganalyze.com/static/ecd4126fc167bfe7a01adca9808529e3/1d69c/deparser.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Postgres itself has deparser logic in many places. For example postgres_fdw has a deparser to generate the query to send to the remote server. But, the deparser code in Postgres requires a post-parse analysis parse tree (that directly references relation OIDs, etc). That means we can&apos;t make use of it in pg_query, which works with raw parse trees.&lt;/p&gt;
&lt;p&gt;For many years now, the Ruby pg_query library has had a deparser. Over the years we&apos;ve had many community contributions to make it complete. The third-party libraries for Python and Node.js also have their own deparser. These efforts were all done in parallel, without sharing code. And the Go library is missing a deparser altogether.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How can we reduce the duplicated effort in the community?&lt;/strong&gt; By creating a new portable deparser for raw parse trees. This avoids having duplicate efforts for every pg_query-based library.&lt;/p&gt;
&lt;h3 id=&quot;the-pg_query-deparser-with-coverage-for-all-postgres-regression-tests&quot; &gt;&lt;a href=&quot;#the-pg_query-deparser-with-coverage-for-all-postgres-regression-tests&quot; aria-label=&quot;the pg_query deparser with coverage for all postgres regression tests permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The pg_query deparser with coverage for all Postgres regression tests&lt;/h3&gt;
&lt;p&gt;pg_query 2.0 features a new deparser, written in C. This was by far the biggest undertaking of this new release. The new deparser is able to generate all SQL queries used in the Postgres regression tests (which the pg_query parser can of course parse), and more.&lt;/p&gt;
&lt;p&gt;Here is how it works, using the Go library as an example, which previously did not have a deparser:&lt;/p&gt;
&lt;div  data-language=&quot;go&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;package&lt;/span&gt; main

&lt;span &gt;import&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;
  &lt;span &gt;&quot;fmt&quot;&lt;/span&gt;
  pg_query &lt;span &gt;&quot;github.com/pganalyze/pg_query_go/v2&quot;&lt;/span&gt;
&lt;span &gt;)&lt;/span&gt;

&lt;span &gt;func&lt;/span&gt; &lt;span &gt;main&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
  &lt;span &gt;// Parse a query&lt;/span&gt;
  result&lt;span &gt;,&lt;/span&gt; err &lt;span &gt;:=&lt;/span&gt; pg_query&lt;span &gt;.&lt;/span&gt;&lt;span &gt;Parse&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT 42&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;if&lt;/span&gt; err &lt;span &gt;!=&lt;/span&gt; &lt;span &gt;nil&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;panic&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;err&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;}&lt;/span&gt;

  &lt;span &gt;// Modify the parse tree&lt;/span&gt;
  result&lt;span &gt;.&lt;/span&gt;Stmts&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;Stmt&lt;span &gt;.&lt;/span&gt;&lt;span &gt;GetSelectStmt&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;GetTargetList&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;GetResTarget&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;Val &lt;span &gt;=&lt;/span&gt;
    pg_query&lt;span &gt;.&lt;/span&gt;&lt;span &gt;MakeAConstStrNode&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;Hello World&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;-&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;

  &lt;span &gt;// Deparse back into a query&lt;/span&gt;
  stmt&lt;span &gt;,&lt;/span&gt; err &lt;span &gt;:=&lt;/span&gt; pg_query&lt;span &gt;.&lt;/span&gt;&lt;span &gt;Deparse&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;result&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;if&lt;/span&gt; err &lt;span &gt;!=&lt;/span&gt; &lt;span &gt;nil&lt;/span&gt; &lt;span &gt;{&lt;/span&gt;
    &lt;span &gt;panic&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;err&lt;span &gt;)&lt;/span&gt;
  &lt;span &gt;}&lt;/span&gt;
  fmt&lt;span &gt;.&lt;/span&gt;&lt;span &gt;Printf&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;%s\n&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; stmt&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will output the following:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;SELECT &apos;Hello World&apos;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;First, the deparsing step encodes the Go structs into the new Protobuf format. Then, the C library decodes this into Postgres parse tree C structs. Last but not least, the C library&apos;s new deparser turns the C structs into the SQL query text.&lt;/p&gt;
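&lt;p&gt;To give a feel for that last step, here is a toy deparser in Ruby that handles only a tiny subset of SELECT. It is a sketch under heavy assumptions, not the C deparser shipped with pg_query: the hash-based node layout and the &lt;code &gt;deparse&lt;/code&gt; helper are invented for this illustration.&lt;/p&gt;

```ruby
# Toy parse tree for: SELECT 'Hello World', id FROM mytable
# The node layout is made up for this sketch.
TREE = {
  type: :select,
  targets: [
    { type: :const_str, value: 'Hello World' },
    { type: :column, name: 'id' }
  ],
  from: { type: :relation, name: 'mytable' }
}

# Turn one node back into SQL text by dispatching on its type.
def deparse(node)
  case node[:type]
  when :select
    sql = 'SELECT ' + node[:targets].map { |t| deparse(t) }.join(', ')
    sql += ' FROM ' + deparse(node[:from]) if node[:from]
    sql
  when :const_int
    node[:value].to_s
  when :const_str
    # Double any single quote so the literal stays valid SQL.
    "'" + node[:value].gsub("'", "''") + "'"
  when :column, :relation
    node[:name]
  else
    raise ArgumentError, "unsupported node type: #{node[:type].inspect}"
  end
end

puts deparse(TREE)

# Modify the tree and deparse again, similar to the Go example.
TREE[:targets][0] = { type: :const_int, value: 42 }
puts deparse(TREE)
```

&lt;p&gt;A complete deparser also has to deal with identifier quoting, operator precedence and every statement type, which is why covering the Postgres regression tests was such a large undertaking.&lt;/p&gt;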
&lt;p&gt;Stepping away from deparsing, let&apos;s take a look at the new fingerprinting mechanism:&lt;/p&gt;
&lt;h2 id=&quot;fingerprints-in-pg_query-a-better-way-to-check-if-two-queries-are-identical&quot; &gt;&lt;a href=&quot;#fingerprints-in-pg_query-a-better-way-to-check-if-two-queries-are-identical&quot; aria-label=&quot;fingerprints in pg_query a better way to check if two queries are identical permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Fingerprints in pg_query: A better way to check if two queries are identical&lt;/h2&gt;
&lt;p&gt;Let&apos;s start with the motivation for query fingerprints. pganalyze needs to link together Postgres statistics across different data sources, for example queries from &lt;code &gt;pg_stat_statements&lt;/code&gt; with the Postgres &lt;code &gt;auto_explain&lt;/code&gt; logs. You can see the fingerprint in pganalyze on the query details page:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/f1ff8b6c20366e040045fc97d416895b/acf8f/pganalyze_query_details.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;pganalyze Query Details page showing a query and its associated fingerprint value&quot;
        title=&quot;pganalyze Query Details page showing a query and its associated fingerprint value&quot;
        src=&quot;https://pganalyze.com/static/f1ff8b6c20366e040045fc97d416895b/1d69c/pganalyze_query_details.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;This query can be represented differently depending on which part of Postgres you look at:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;pg_stat_statements: &lt;code &gt;SELECT &quot;abalance&quot; FROM &quot;pgbench_accounts&quot; WHERE &quot;aid&quot; = ?&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;auto_explain: &lt;code &gt;SELECT abalance FROM pgbench_accounts WHERE aid = 4674588&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A simple text comparison would not be sufficient to determine that these queries are identical.&lt;/p&gt;
&lt;h3 id=&quot;why-did-we-create-our-own-query-fingerprint-concept&quot; &gt;&lt;a href=&quot;#why-did-we-create-our-own-query-fingerprint-concept&quot; aria-label=&quot;why did we create our own query fingerprint concept permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why did we create our own query fingerprint concept?&lt;/h3&gt;
&lt;p&gt;Postgres already has the concept of a &quot;queryid&quot;, calculated based on the post-parse analysis tree. It&apos;s used in places such as &lt;code &gt;pg_stat_statements&lt;/code&gt; to distinguish the different query entries.&lt;/p&gt;
&lt;p&gt;But this queryid is not available everywhere today, e.g. you can&apos;t get it with &lt;code &gt;auto_explain&lt;/code&gt; plans. It&apos;s also not portable between databases, as it&apos;s dependent on specific relation OIDs. Even if you have the exact same queries on your staging and production system, they will have different queryid values. And the queryid can&apos;t be generated outside the context of a Postgres server. Thus, pganalyze has its own mechanism, called a query fingerprint.&lt;/p&gt;
&lt;p&gt;Fingerprints identify a Postgres query based on its raw parse tree alone. We&apos;ve open-sourced this mechanism in pg_query:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;fingerprint&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;SELECT a, b FROM c&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;# =&gt; &quot;fb1f305bea85c2f6&quot;&lt;/span&gt;

&lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;fingerprint&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;SELECT b, a FROM c&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;# =&gt; &quot;fb1f305bea85c2f6&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This mechanism does not need a running server, so all you need as input is a valid Postgres query.&lt;/p&gt;
&lt;p&gt;With pg_query 2.0, we&apos;ve made a few enhancements to the fingerprint functionality:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use the faster XXH3 hash function, instead of SHA-1.&lt;/strong&gt; pg_query 1.0 used the outdated cryptographic hash function SHA-1. Cryptographic guarantees are not needed for this use case, and XXH3 is much faster.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contain the fingerprint in a 64-bit value, instead of 136 bits.&lt;/strong&gt; We&apos;ve determined that 64-bit precision is good enough for query fingerprints. Postgres itself thinks so too, since it uses 64-bit for the Postgres queryid. We often use data from &lt;code &gt;pg_stat_statements&lt;/code&gt;, so there is little benefit to more bits. Using a smaller data type also means better performance for pganalyze.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fix edge cases where two almost identical queries had different fingerprints.&lt;/strong&gt; Fingerprints should ignore differences between queries that express the same query intent. We&apos;ve addressed a few cases where this was not working as expected. You can look at the corresponding &lt;a href=&quot;https://github.com/pganalyze/libpg_query/wiki/Fingerprinting#version-30-based-on-postgresql-13&quot;&gt;wiki page&lt;/a&gt; to understand these changes in more detail.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
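&lt;p&gt;The idea behind fingerprinting a parse tree can be sketched in a few lines of Ruby. This is an illustration only and will not reproduce pg_query&apos;s fingerprint values: the tree layout and helper names are invented, and SHA-256 from Ruby&apos;s standard library stands in for XXH3, which the standard library does not ship.&lt;/p&gt;

```ruby
require 'digest'

# Invented, simplified parse trees for the two query texts:
#   SELECT "abalance" FROM "pgbench_accounts" WHERE "aid" = ?
#   SELECT abalance FROM pgbench_accounts WHERE aid = 4674588
# Identifier quoting is already gone at the parse tree level.
def query_tree(right_hand_side)
  {
    'SelectStmt' => {
      'targetList' => [{ 'ColumnRef' => { 'name' => 'abalance', 'location' => 7 } }],
      'fromClause' => [{ 'RangeVar' => { 'relname' => 'pgbench_accounts' } }],
      'whereClause' => {
        'A_Expr' => {
          'name' => '=',
          'lexpr' => { 'ColumnRef' => { 'name' => 'aid' } },
          'rexpr' => right_hand_side
        }
      }
    }
  }
end

# Node types that should not influence the fingerprint.
IGNORED_NODES = %w[A_Const ParamRef].freeze

# Flatten the tree into a list of tokens, skipping constants, parameter
# placeholders and location fields.
def fingerprint_tokens(node, tokens = [])
  case node
  when Hash
    node.each do |key, value|
      next if key == 'location' || IGNORED_NODES.include?(key)
      tokens.push(key)
      fingerprint_tokens(value, tokens)
    end
  when Array
    node.each { |item| fingerprint_tokens(item, tokens) }
  else
    tokens.push(node.to_s)
  end
  tokens
end

# Hash the tokens and keep 64 bits (16 hex characters).
def fingerprint(tree)
  Digest::SHA256.hexdigest(fingerprint_tokens(tree).join("\0"))[0, 16]
end

from_statements = query_tree({ 'ParamRef' => { 'number' => 0 } })
from_auto_explain = query_tree({ 'A_Const' => { 'ival' => 4674588 } })
puts fingerprint(from_statements) == fingerprint(from_auto_explain)
```

&lt;p&gt;Because constants and placeholders are skipped before hashing, both representations of the pgbench query end up with the same value, while a query with a different structure would not.&lt;/p&gt;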
&lt;h2 id=&quot;additional-changes-for-pg_query-20&quot; &gt;&lt;a href=&quot;#additional-changes-for-pg_query-20&quot; aria-label=&quot;additional changes for pg_query 20 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Additional changes for pg_query 2.0&lt;/h2&gt;
&lt;p&gt;A few other things about the new release:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The pg_query library now resides in the pganalyze organization on GitHub. This makes it clear who maintains and funds the core development. We will continue to make pg_query available under the BSD 3-clause license.&lt;/li&gt;
&lt;li&gt;pg_query has a new method for splitting queries. This can be useful when you want to split a multi-statement string into its component statements, for example &lt;code &gt;SELECT &apos;;&apos;; SELECT &apos;foo&apos;&lt;/code&gt; into &lt;code &gt;SELECT &apos;;&apos;&lt;/code&gt; and &lt;code &gt;SELECT &apos;foo&apos;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;There is a new function available to access the Postgres scanner. This includes the location of comments in a query text. One could envision building a syntax highlighter based on this, or extracting comments from queries whilst ignoring comment-like tokens in a constant value.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The new pg_query 2.0 is &lt;a href=&quot;https://github.com/pganalyze/libpg_query&quot;&gt;available today&lt;/a&gt;, with bindings for &lt;a href=&quot;https://github.com/pganalyze/pg_query_go&quot;&gt;Go&lt;/a&gt; and &lt;a href=&quot;https://github.com/pganalyze/pg_query&quot;&gt;Ruby&lt;/a&gt; available to start. We are also working on a new pganalyze-maintained Rust binding that we&apos;ll have news about soon.&lt;/p&gt;
&lt;p&gt;Help us get the word out by &lt;a href=&quot;https://twitter.com/intent/tweet?text=Introducing%20pg_query%202.0%20-%20The%20easiest%20way%20to%20parse%20Postgres%20queries%3A%20https://pganalyze.com/blog/pg-query-2-0-postgres-query-parser&quot;&gt;sharing this post on Twitter&lt;/a&gt;.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Postgres 11: Monitoring JIT performance, Auto Prewarm & Stored Procedures]]></title><description><![CDATA[Everyone’s favorite database, PostgreSQL, has a new release coming out soon: Postgres 11 In this post we take a look at some of the new features that are part of the release, and in particular review the things you may need to monitor, or can utilize to increase your application and query performance.  Just-In-Time compilation (JIT) in Postgres 11 Just-In-Time compilation (JIT) for query execution was added in Postgres 11. It's not going to be enabled for queries by default, similar to parallel…]]></description><link>https://pganalyze.com/blog/postgres11-jit-compilation-auto-prewarm-sql-stored-procedures</link><guid isPermaLink="false">https://pganalyze.com/blog/postgres11-jit-compilation-auto-prewarm-sql-stored-procedures</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Thu, 04 Oct 2018 12:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Everyone’s favorite database, PostgreSQL, has a new release coming out soon: &lt;strong&gt;&lt;a href=&quot;https://www.postgresql.org/docs/11/static/release-11.html&quot;&gt;Postgres 11&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this post we take a look at some of the new features that are part of the release, and in particular review the things you may need to monitor, or can utilize to increase your application and query performance.&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/f510dbb938b762fb8a629528636a45d6/09ede/jit_performance.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;JIT Performance in Postgres 11&quot;
        title=&quot;JIT Performance in Postgres 11&quot;
        src=&quot;https://pganalyze.com/static/f510dbb938b762fb8a629528636a45d6/1d69c/jit_performance.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;h2 id=&quot;just-in-time-compilation-jit-in-postgres-11&quot; &gt;&lt;a href=&quot;#just-in-time-compilation-jit-in-postgres-11&quot; aria-label=&quot;just in time compilation jit in postgres 11 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Just-In-Time compilation (JIT) in Postgres 11&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://www.postgresql.org/docs/11/static/jit-reason.html&quot;&gt;Just-In-Time compilation (JIT)&lt;/a&gt; for query execution was added in Postgres 11. It&apos;s not going to be enabled for queries by default, similar to parallel query in Postgres 9.6, but can be very helpful for CPU-bound workloads and analytical queries.&lt;/p&gt;
&lt;p&gt;Specifically, JIT currently aims to optimize two essential parts of query execution: Expression evaluation and tuple deforming. To quote the &lt;a href=&quot;https://www.postgresql.org/docs/11/static/jit-reason.html&quot;&gt;Postgres documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Expression evaluation&lt;/strong&gt; is used to evaluate WHERE clauses, target lists, aggregates and projections. It can be accelerated by generating code specific to each case.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tuple deforming&lt;/strong&gt; is the process of transforming an on-disk tuple into its in-memory representation. It can be accelerated by creating a function specific to the table layout and the number of columns to be extracted.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Often you will have a workload that is mixed, where some queries will benefit from JIT, and some will be slowed down by the overhead.&lt;/p&gt;
&lt;p&gt;Here is how you can monitor JIT performance using EXPLAIN and &lt;code &gt;auto_explain&lt;/code&gt;, as well as how you can determine whether your queries are benefiting from JIT optimization.&lt;/p&gt;
&lt;h3 id=&quot;monitoring-jit-with-explain--auto_explain&quot; &gt;&lt;a href=&quot;#monitoring-jit-with-explain--auto_explain&quot; aria-label=&quot;monitoring jit with explain  auto_explain permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Monitoring JIT with EXPLAIN / auto_explain&lt;/h3&gt;
&lt;p&gt;First of all, you will need to make sure that your Postgres packages are compiled with JIT support (&lt;code &gt;--with-llvm&lt;/code&gt; configuration switch). Assuming that you have Postgres binaries compiled like that, the &lt;code &gt;jit&lt;/code&gt; configuration parameter controls whether JIT is actually being used.&lt;/p&gt;
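&lt;p&gt;As a quick check before experimenting, the statements below sketch how you might verify both conditions from a psql session. This assumes Postgres 11 or newer, where the built-in &lt;code &gt;pg_jit_available()&lt;/code&gt; function reports whether JIT can be used in the current session:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Is JIT turned on for this session?
SHOW jit;

-- Returns true only if jit is on and the JIT provider could be loaded
SELECT pg_jit_available();&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;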
&lt;p&gt;For this example, we’re working with one of our staging databases, and pick a relatively simple query that can benefit from JIT:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;COUNT&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; log_lines
 &lt;span &gt;WHERE&lt;/span&gt; log_classification &lt;span &gt;=&lt;/span&gt; &lt;span &gt;65&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;details&lt;span &gt;-&lt;/span&gt;&lt;span &gt;&gt;&gt;&lt;/span&gt;&lt;span &gt;&apos;new_dead_tuples&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;::&lt;span &gt;integer&lt;/span&gt; &lt;span &gt;&gt;=&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For context, the table &lt;code &gt;log_lines&lt;/code&gt; is an internal log event statistics table of pganalyze, which is typically indexed per-server, but in this case we want to run an analytical query across all servers to count interesting &lt;a href=&quot;https://pganalyze.com/docs/log-insights/autovacuum/A65&quot;&gt;autovacuum completed log events&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;First, if we run the query with &lt;code &gt;jit = off&lt;/code&gt;, we will get an execution plan and runtime like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;ANALYZE&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; BUFFERS&lt;span &gt;)&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;COUNT&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; log_lines
    &lt;span &gt;WHERE&lt;/span&gt; log_classification &lt;span &gt;=&lt;/span&gt; &lt;span &gt;65&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;details&lt;span &gt;-&lt;/span&gt;&lt;span &gt;&gt;&gt;&lt;/span&gt;&lt;span &gt;&apos;new_dead_tuples&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;::&lt;span &gt;integer&lt;/span&gt; &lt;span &gt;&gt;=&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                        QUERY PLAN                                                        │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=649724.03..649724.04 rows=1 width=8) (actual time=3498.939..3498.939 rows=1 loops=1)                    │
│   Buffers: shared hit=1538 read=386328                                                                                   │
│   I/O Timings: read=1098.036                                                                                             │
│   -&amp;gt;  Seq Scan on log_lines  (cost=0.00..649675.55 rows=19393 width=0) (actual time=0.028..3437.032 rows=667063 loops=1) │
│         Filter: ((log_classification = 65) AND (((details -&amp;gt;&amp;gt; &amp;#39;new_dead_tuples&amp;#39;::text))::integer &amp;gt;= 0))                  │
│         Rows Removed by Filter: 14396065                                                                                 │
│         Buffers: shared hit=1538 read=386328                                                                             │
│         I/O Timings: read=1098.036                                                                                       │
│ Planning Time: 0.095 ms                                                                                                  │
│ Execution Time: 3499.089 ms                                                                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

Time: 3499.580 ms (00:03.500)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note the usage of EXPLAIN&apos;s &lt;code &gt;BUFFERS&lt;/code&gt; option so we can compare whether any caching behavior affects our benchmarking. We can also see that I/O time was 1,098 ms out of 3,499 ms (roughly 31%), so most of the runtime is spent on the CPU and this query is CPU bound.&lt;/p&gt;
&lt;p&gt;For comparison, when we enable JIT, we can see the following:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SET&lt;/span&gt; jit &lt;span &gt;=&lt;/span&gt; &lt;span &gt;on&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;EXPLAIN&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;&lt;span &gt;ANALYZE&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; BUFFERS&lt;span &gt;)&lt;/span&gt; &lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;COUNT&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;*&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; log_lines
    &lt;span &gt;WHERE&lt;/span&gt; log_classification &lt;span &gt;=&lt;/span&gt; &lt;span &gt;65&lt;/span&gt; &lt;span &gt;AND&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;details&lt;span &gt;-&lt;/span&gt;&lt;span &gt;&gt;&gt;&lt;/span&gt;&lt;span &gt;&apos;new_dead_tuples&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;::&lt;span &gt;integer&lt;/span&gt; &lt;span &gt;&gt;=&lt;/span&gt; &lt;span &gt;0&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                        QUERY PLAN                                                         │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=649724.03..649724.04 rows=1 width=8) (actual time=2816.497..2816.498 rows=1 loops=1)                     │
│   Buffers: shared hit=1570 read=386296                                                                                    │
│   I/O Timings: read=1154.438                                                                                              │
│   -&amp;gt;  Seq Scan on log_lines  (cost=0.00..649675.55 rows=19393 width=0) (actual time=78.912..2759.717 rows=667063 loops=1) │
│         Filter: ((log_classification = 65) AND (((details -&amp;gt;&amp;gt; &amp;#39;new_dead_tuples&amp;#39;::text))::integer &amp;gt;= 0))                   │
│         Rows Removed by Filter: 14396065                                                                                  │
│         Buffers: shared hit=1570 read=386296                                                                              │
│         I/O Timings: read=1154.438                                                                                        │
│ Planning Time: 0.095 ms                                                                                                   │
│ JIT:                                                                                                                      │
│   Functions: 4                                                                                                            │
│   Options: Inlining true, Optimization true, Expressions true, Deforming true                                             │
│   Timing: Generation 1.044 ms, Inlining 14.205 ms, Optimization 46.678 ms, Emission 17.868 ms, Total 79.795 ms            │
│ Execution Time: 2817.713 ms                                                                                               │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(14 rows)

Time: 2818.250 ms (00:02.818)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this case, JIT yields about a &lt;strong&gt;25%&lt;/strong&gt; speed-up, due to spending less CPU time, without any extra effort on our end. We can also see that JIT tasks themselves added 79 ms to the runtime.&lt;/p&gt;
&lt;p&gt;You can fine tune whether JIT is used for a particular query by the &lt;code &gt;jit_above_cost&lt;/code&gt; parameter which applies to the total cost of the query as determined by the Postgres planner. The cost is &lt;code &gt;649724&lt;/code&gt; in the above EXPLAIN output, which exceeds the default &lt;code &gt;jit_above_cost&lt;/code&gt; threshold of &lt;code &gt;100000&lt;/code&gt;. In a future post we&apos;ll walk through more examples of when using JIT can be beneficial.&lt;/p&gt;
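&lt;p&gt;As a minimal sketch of how these thresholds can be inspected and adjusted for a single session: the value &lt;code &gt;1000000&lt;/code&gt; below is an arbitrary choice for illustration, not a recommendation. With it, the example query (total cost of about 649724) would no longer be JIT compiled:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Show the current thresholds (defaults: 100000, 500000 and 500000)
SHOW jit_above_cost;
SHOW jit_inline_above_cost;
SHOW jit_optimize_above_cost;

-- Only JIT compile queries with a planner cost above 1000000 (this session only)
SET jit_above_cost = 1000000;

-- Go back to the configured default
RESET jit_above_cost;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The two additional settings decide when the more expensive inlining and optimization steps are applied. The example plan exceeds both of their defaults, which is why it reports &lt;code &gt;Inlining true&lt;/code&gt; and &lt;code &gt;Optimization true&lt;/code&gt;.&lt;/p&gt;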
&lt;p&gt;You can gather these JIT statistics either for individual queries that you are interested in (using EXPLAIN), or automatically collect them for all of your queries using the &lt;code &gt;auto_explain&lt;/code&gt; extension. If you want to learn more about how to enable &lt;code &gt;auto_explain&lt;/code&gt; we recommend reviewing our guide about it: &lt;a href=&quot;https://pganalyze.com/docs/log-insights/setup/tuning-log-config-settings&quot;&gt;pganalyze Log Insights - Tuning Log Config Settings&lt;/a&gt;.&lt;/p&gt;
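&lt;p&gt;For a quick experiment, &lt;code &gt;auto_explain&lt;/code&gt; can also be loaded into a single session by a superuser. The following is only a sketch; the one second threshold is an assumption for illustration, and collecting actual timings for every statement adds overhead:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Load auto_explain into the current session only
LOAD &apos;auto_explain&apos;;

-- Log the plan of every statement that runs longer than one second
SET auto_explain.log_min_duration = &apos;1s&apos;;

-- Include actual run times and buffer usage, like EXPLAIN (ANALYZE, BUFFERS)
SET auto_explain.log_analyze = on;
SET auto_explain.log_buffers = on;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;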
&lt;p&gt;&lt;strong&gt;Fun fact:&lt;/strong&gt; As part of the writing of this article we ran experiments with JIT and &lt;code &gt;auto_explain&lt;/code&gt;, and discovered that JIT information wasn’t included with &lt;code &gt;auto_explain&lt;/code&gt;, but only with regular EXPLAINs. Luckily, we were able to &lt;a href=&quot;https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b076eb7669d7279d0f446305c2e12dffd6bc3347&quot;&gt;contribute a bug fix to Postgres&lt;/a&gt;, which has been merged and will be part of the Postgres 11 release.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        title=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        src=&quot;https://pganalyze.com/static/c15d0b3082bebd2680b86cc948555f76/acb04/ebook_promo_query_performance.jpg&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;preventing-cold-caches-auto-prewarm-in-postgres-11&quot; &gt;&lt;a href=&quot;#preventing-cold-caches-auto-prewarm-in-postgres-11&quot; aria-label=&quot;preventing cold caches auto prewarm in postgres 11 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Preventing cold caches: Auto prewarm in Postgres 11&lt;/h2&gt;
&lt;p&gt;A neat feature that will help you improve performance right after restarting Postgres is the new autoprewarm background worker functionality.&lt;/p&gt;
&lt;p&gt;If you are not familiar with &lt;a href=&quot;https://www.postgresql.org/docs/11/static/pgprewarm.html&quot;&gt;pg_prewarm&lt;/a&gt;, it&apos;s an extension that&apos;s bundled with Postgres (much like &lt;code &gt;pg_stat_statements&lt;/code&gt;), that you can use to preload data that’s on disk into the Postgres buffer cache.&lt;/p&gt;
&lt;p&gt;It is often very useful to ensure that a certain table is cached before the first production query hits the database, to avoid an overly slow response due to data being loaded from disk.&lt;/p&gt;
&lt;p&gt;Previously, you needed to manually specify which relations (i.e. tables) and which page offsets to preload, which was cumbersome, and hard to automate.&lt;/p&gt;
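&lt;p&gt;For reference, the manual approach looks roughly like the sketch below. It reuses the &lt;code &gt;log_lines&lt;/code&gt; table from the JIT example purely for illustration, and the block range in the second call is an arbitrary assumption:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load the whole table into the Postgres buffer cache;
-- the return value is the number of blocks that were prewarmed
SELECT pg_prewarm(&apos;log_lines&apos;);

-- Load only blocks 0 through 999 of the main fork
SELECT pg_prewarm(&apos;log_lines&apos;, &apos;buffer&apos;, &apos;main&apos;, 0, 999);&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;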
&lt;h3 id=&quot;caching-tables-with-autoprewarm&quot; &gt;&lt;a href=&quot;#caching-tables-with-autoprewarm&quot; aria-label=&quot;caching tables with autoprewarm permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Caching tables with autoprewarm&lt;/h3&gt;
&lt;p&gt;Starting in Postgres 11, you can instead have this done automatically, by adding &lt;code &gt;pg_prewarm&lt;/code&gt; to &lt;code &gt;shared_preload_libraries&lt;/code&gt; like this:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;shared_preload_libraries = &amp;#39;pg_prewarm,pg_stat_statements&amp;#39;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Doing this will automatically save information on which tables/indices are in the buffer cache (and which parts of them) every 300 seconds to a file called &lt;code &gt;autoprewarm.blocks&lt;/code&gt;, and use that information after Postgres restarts to reload the previously cached data from disk into the buffer cache, thus improving initial query performance.&lt;/p&gt;
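&lt;p&gt;The behavior described above corresponds to two settings of the extension. A sketch of a &lt;code &gt;postgresql.conf&lt;/code&gt; fragment that spells out their default values:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;shared_preload_libraries = &amp;#39;pg_prewarm,pg_stat_statements&amp;#39;

# Run the autoprewarm background worker (default: on)
pg_prewarm.autoprewarm = on

# How often the list of cached blocks is written to autoprewarm.blocks (default: 300s)
pg_prewarm.autoprewarm_interval = 300s&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you don&apos;t want to wait for the next interval, for example right before a planned restart, the extension also provides &lt;code &gt;SELECT autoprewarm_dump_now();&lt;/code&gt; to write the file immediately.&lt;/p&gt;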
&lt;hr&gt;
&lt;h2 id=&quot;stored-procedures-in-postgres-11&quot; &gt;&lt;a href=&quot;#stored-procedures-in-postgres-11&quot; aria-label=&quot;stored procedures in postgres 11 permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Stored procedures in Postgres 11&lt;/h2&gt;
&lt;p&gt;Postgres has had database server-side functions for a long time, with a variety of supported languages. You might have used the term “procedures” before to refer to such functions, as they are similar to what’s called “Stored Procedures” in other database systems such as Oracle.&lt;/p&gt;
&lt;p&gt;However, one detail that is sometimes missed is that existing functions in Postgres always run within a single transaction. There was no way to begin, commit, or roll back a transaction within a function, as they were not allowed to run outside of a transaction context.&lt;/p&gt;
&lt;p&gt;Starting in Postgres 11, you will have the ability to use &lt;code &gt;CREATE PROCEDURE&lt;/code&gt; instead of &lt;code &gt;CREATE FUNCTION&lt;/code&gt; to create procedures.&lt;/p&gt;
&lt;h3 id=&quot;benefits-of-using-stored-procedures&quot; &gt;&lt;a href=&quot;#benefits-of-using-stored-procedures&quot; aria-label=&quot;benefits of using stored procedures permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Benefits of using stored procedures&lt;/h3&gt;
&lt;p&gt;Compared to regular functions, procedures can do more than just query or modify data: They also have the ability to begin/commit/rollback transactions within the procedure.&lt;/p&gt;
&lt;p&gt;Particularly for those moving over from Oracle to PostgreSQL, the new procedure functionality can be a significant time saver. You can find some examples of how to convert procedures between those two relational database systems in the &lt;a href=&quot;https://www.postgresql.org/docs/11/static/plpgsql-porting.html&quot;&gt;Postgres documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;how-to-use-stored-procedures&quot; &gt;&lt;a href=&quot;#how-to-use-stored-procedures&quot; aria-label=&quot;how to use stored procedures permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How to use stored procedures&lt;/h3&gt;
&lt;p&gt;First, let’s create a simple procedure that handles some tables:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;PROCEDURE&lt;/span&gt; my_table_task&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;LANGUAGE&lt;/span&gt; plpgsql &lt;span &gt;AS&lt;/span&gt; $$
&lt;span &gt;DECLARE&lt;/span&gt;
&lt;span &gt;BEGIN&lt;/span&gt;
  &lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; table_committed &lt;span &gt;(&lt;/span&gt;id &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;COMMIT&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; table_rolled_back &lt;span &gt;(&lt;/span&gt;id &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;ROLLBACK&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;END&lt;/span&gt; $$&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can then call this procedure like this, using the new CALL statement:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;=# CALL my_table_task();
CALL
Time: 1.573 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here you can see the benefit of procedures: despite the rollback, the overall execution is successful. The first table was created, but the second one was not, since its transaction was rolled back.&lt;/p&gt;
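&lt;p&gt;One way to double check the result is to look up both table names with &lt;code &gt;to_regclass&lt;/code&gt;, which returns NULL for relations that don&apos;t exist. This sketch assumes the tables were created in a schema on your &lt;code &gt;search_path&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- committed should show the table name, rolled_back should be NULL
SELECT to_regclass(&apos;table_committed&apos;) AS committed,
       to_regclass(&apos;table_rolled_back&apos;) AS rolled_back;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;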
&lt;h3 id=&quot;be-careful-transaction-timestamps-and-xact_start-for-procedures&quot; &gt;&lt;a href=&quot;#be-careful-transaction-timestamps-and-xact_start-for-procedures&quot; aria-label=&quot;be careful transaction timestamps and xact_start for procedures permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Be careful: Transaction timestamps and xact_start for procedures&lt;/h3&gt;
&lt;p&gt;Expanding on how transactions work inside procedures, there is currently an oddity with the transaction timestamp, which for example you can see in &lt;code &gt;xact_start&lt;/code&gt;. When we expand the procedure like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;PROCEDURE&lt;/span&gt; my_table_task&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;LANGUAGE&lt;/span&gt; plpgsql &lt;span &gt;AS&lt;/span&gt; $$
&lt;span &gt;DECLARE&lt;/span&gt;
  clock_str &lt;span &gt;TEXT&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  tx_str &lt;span &gt;TEXT&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;BEGIN&lt;/span&gt;
  &lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; table_committed &lt;span &gt;(&lt;/span&gt;id &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
    &lt;span &gt;SELECT&lt;/span&gt; clock_timestamp&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; clock_str&lt;span &gt;;&lt;/span&gt;
    &lt;span &gt;SELECT&lt;/span&gt; transaction_timestamp&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; tx_str&lt;span &gt;;&lt;/span&gt;
    RAISE NOTICE &lt;span &gt;&apos;After 1st CREATE TABLE: % clock, % xact&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; clock_str&lt;span &gt;,&lt;/span&gt; tx_str&lt;span &gt;;&lt;/span&gt;
    PERFORM pg_sleep&lt;span &gt;(&lt;/span&gt;&lt;span &gt;5&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;COMMIT&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;CREATE&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; table_rolled_back &lt;span &gt;(&lt;/span&gt;id &lt;span &gt;int&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
    &lt;span &gt;SELECT&lt;/span&gt; clock_timestamp&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; clock_str&lt;span &gt;;&lt;/span&gt;
    &lt;span &gt;SELECT&lt;/span&gt; transaction_timestamp&lt;span &gt;(&lt;/span&gt;&lt;span &gt;)&lt;/span&gt; &lt;span &gt;INTO&lt;/span&gt; tx_str&lt;span &gt;;&lt;/span&gt;
    RAISE NOTICE &lt;span &gt;&apos;After 2nd CREATE TABLE: % clock, % xact&apos;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; clock_str&lt;span &gt;,&lt;/span&gt; tx_str&lt;span &gt;;&lt;/span&gt;
  &lt;span &gt;ROLLBACK&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;END&lt;/span&gt; $$&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And then call the procedure, we see the following:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;=# CALL my_table_task();
NOTICE:  00000: After 1st CREATE TABLE: 2018-10-03 22:17:26 clock, 2018-10-03 22:17:26 xact
NOTICE:  00000: After 2nd CREATE TABLE: 2018-10-03 22:17:31 clock, 2018-10-03 22:17:26 xact
CALL
Time: 5022.598 ms&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Despite there being two transactions in the procedure, the transaction start timestamp is that of when the procedure got called, not when the embedded transaction actually started.&lt;/p&gt;
&lt;p&gt;You will see the same problem with the &lt;code &gt;xact_start&lt;/code&gt; field in &lt;code &gt;pg_stat_activity&lt;/code&gt;, causing monitoring scripts to potentially detect false positives for long running transactions. This issue is &lt;a href=&quot;https://www.postgresql.org/message-id/flat/20180920234040.GC29981%40momjian.us&quot;&gt;currently in discussion&lt;/a&gt; and likely to be changed before the final release.&lt;/p&gt;
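&lt;p&gt;To illustrate which kind of check is affected, here is a sketch of a typical long running transaction query. The five minute threshold is an arbitrary assumption; a procedure that commits regularly but runs longer than the threshold overall could show up here:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Sessions whose current transaction appears to be older than 5 minutes
SELECT pid, now() - xact_start AS xact_age, state, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
   AND now() - xact_start &gt; interval &apos;5 minutes&apos;
 ORDER BY xact_start;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;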
&lt;h3 id=&quot;how-often-does-my-stored-procedure-get-called&quot; &gt;&lt;a href=&quot;#how-often-does-my-stored-procedure-get-called&quot; aria-label=&quot;how often does my stored procedure get called permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How often does my stored procedure get called?&lt;/h3&gt;
&lt;p&gt;Now, if you want to monitor the performance of procedures, it gets a bit difficult. Whilst regular functions can be tracked using &lt;code &gt;track_functions = all&lt;/code&gt;, there is no such facility for procedures. You can however track the execution of CALL statements using &lt;code &gt;pg_stat_statements&lt;/code&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; query&lt;span &gt;,&lt;/span&gt; calls&lt;span &gt;,&lt;/span&gt; total_time &lt;span &gt;FROM&lt;/span&gt; pg_stat_statements &lt;span &gt;WHERE&lt;/span&gt; query &lt;span &gt;LIKE&lt;/span&gt; &lt;span &gt;&apos;CALL%&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;┌────────────┬───────┬────────────┐
│   query    │ calls │ total_time │
├────────────┼───────┼────────────┤
│ CALL abc() │     4 │    5.62299 │
└────────────┴───────┴────────────┘&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In addition, when you enable &lt;code &gt;pg_stat_statements.track = all&lt;/code&gt;, queries that are called from within a procedure will be tracked, and made available in &lt;a href=&quot;https://pganalyze.com&quot;&gt;Postgres query performance monitoring tools such as pganalyze&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Postgres 11 is going to be the best Postgres release yet, and we are excited to put it into use.&lt;/p&gt;
&lt;p&gt;Whilst common wisdom is to not upgrade right after a release, we encourage you to try out the new release early, help the community find bugs (just like we did!), and make sure that your performance monitoring systems are ready to handle the new features that were added.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;PS: If this article was useful to you and you want to share it with your peers you can tweet it by clicking &lt;a href=&quot;https://ctt.ac/JbyV9&quot;&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Postgres Log Monitoring 101: Deadlocks, Checkpoint Tuning & Blocked Queries]]></title><description><![CDATA[Those of us who operate production PostgreSQL databases have many jobs to do - and often there isn't enough time
to take a regular look at the Postgres log files. However, often times those logs contain critical details on how new application code is affecting the database due to locking issues, or how certain configuration parameters cause the database to produce I/O spikes. This post highlights three common performance problems you can find by looking at, and automatically filtering your…]]></description><link>https://pganalyze.com/blog/postgresql-log-monitoring-101-deadlocks-checkpoints-blocked-queries</link><guid isPermaLink="false">https://pganalyze.com/blog/postgresql-log-monitoring-101-deadlocks-checkpoints-blocked-queries</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Mon, 12 Feb 2018 00:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Those of us who operate production PostgreSQL databases have many jobs to do - and often there isn&apos;t enough time
to take a regular look at the Postgres log files.&lt;/p&gt;
&lt;p&gt;However, those logs often contain critical details on how new application code is affecting the database due to locking issues, or how certain configuration parameters cause the database to produce I/O spikes.&lt;/p&gt;
&lt;p&gt;This post highlights three common performance problems you can find by looking at, and automatically filtering, your Postgres logs.&lt;/p&gt;
&lt;h2 id=&quot;blocked-queries&quot; &gt;&lt;a href=&quot;#blocked-queries&quot; aria-label=&quot;blocked queries permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Blocked Queries&lt;/h2&gt;
&lt;p&gt;Among the log events most relevant to performance are blocked queries, which are waiting for locks that another query has taken. On systems that have problems with locks you will often also see very high CPU utilization that can&apos;t be explained.&lt;/p&gt;
&lt;p&gt;First, in order to enable logging of lock waits, set &lt;code &gt;log_lock_waits = on&lt;/code&gt; in your Postgres config. This will emit a log event like the following if a query has been waiting for longer than &lt;code &gt;deadlock_timeout&lt;/code&gt; (default 1s):&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;LOG: process 123 still waiting for ShareLock on transaction 12345678 after 1000.606 ms
STATEMENT: SELECT * FROM table WHERE id = 1 FOR UPDATE;
CONTEXT: while updating tuple (1,3) in relation &amp;quot;table&amp;quot;
DETAIL: Process holding the lock: 456. Wait queue: 123.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This tells us that we&apos;re seeing lock contention on updates for &lt;code &gt;table&lt;/code&gt;, as another transaction holds a lock on the same row we&apos;re trying to update. You can often see this caused by complex transactions that hold locks for too long. One frequent anti-pattern in a typical web app is to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Open a transaction&lt;/li&gt;
&lt;li&gt;Update a timestamp field (e.g. &lt;code &gt;updated_at&lt;/code&gt; in Ruby on Rails)&lt;/li&gt;
&lt;li&gt;Make an API call to an external service&lt;/li&gt;
&lt;li&gt;Commit the transaction&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The lock on the row that you updated in Step 2 will be held all the way to Step 4, which means that if the API call takes a few seconds, you will be holding a lock on that row for that time. If you have any concurrency in your system that affects the same rows, you will see lock contention, and the above lock notice for the queries in Step 2.&lt;/p&gt;
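&lt;p&gt;The steps above can be sketched in SQL as follows, assuming a hypothetical &lt;code &gt;orders&lt;/code&gt; table. The external API call happens in application code and is only shown as a comment. One possible way to shorten the lock duration is to make the API call before opening the transaction:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Anti-pattern: the row lock is held for the duration of the API call
BEGIN;
UPDATE orders SET updated_at = now() WHERE id = 1; -- row lock acquired here
-- (application makes the external API call, taking a few seconds)
COMMIT;                                             -- row lock released here

-- Possible alternative: call the API first, then keep the transaction short
-- (application makes the external API call)
BEGIN;
UPDATE orders SET updated_at = now() WHERE id = 1;
COMMIT;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Whether the work can be reordered like this depends on the application, for example on whether the update has to be rolled back when the API call fails.&lt;/p&gt;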
&lt;p&gt;Often, however, you have to go back to a development or staging system with full query logging to understand the full context of the transaction that&apos;s causing the problem.&lt;/p&gt;
&lt;h2 id=&quot;deadlocks&quot; &gt;&lt;a href=&quot;#deadlocks&quot; aria-label=&quot;deadlocks permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Deadlocks&lt;/h2&gt;
&lt;p&gt;Related to blocked queries, but slightly different, are deadlocks, which result in a cancelled query due to it deadlocking against another query.&lt;/p&gt;
&lt;p&gt;The easiest way to reproduce a deadlock is doing the following:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;--- session 1&lt;/span&gt;
&lt;span &gt;BEGIN&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; &lt;span &gt;table&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;1&lt;/span&gt; &lt;span &gt;FOR&lt;/span&gt; &lt;span &gt;UPDATE&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;

&lt;span &gt;--- session 2&lt;/span&gt;
&lt;span &gt;BEGIN&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; &lt;span &gt;table&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;2&lt;/span&gt; &lt;span &gt;FOR&lt;/span&gt; &lt;span &gt;UPDATE&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; &lt;span &gt;table&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;1&lt;/span&gt; &lt;span &gt;FOR&lt;/span&gt; &lt;span &gt;UPDATE&lt;/span&gt;&lt;span &gt;;&lt;/span&gt; &lt;span &gt;--- this will block waiting for session 1 to finish&lt;/span&gt;

&lt;span &gt;--- session 1&lt;/span&gt;
&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; &lt;span &gt;table&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; id &lt;span &gt;=&lt;/span&gt; &lt;span &gt;2&lt;/span&gt; &lt;span &gt;FOR&lt;/span&gt; &lt;span &gt;UPDATE&lt;/span&gt;&lt;span &gt;;&lt;/span&gt; &lt;span &gt;--- this can never finish as it deadlocks against session 2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Again, after &lt;code &gt;deadlock_timeout&lt;/code&gt; Postgres will see the locking problem. In this case it decides that this will never finish, and emits the following to the logs:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;2018-02-12 09:24:52.176 UTC [3098] ERROR:  deadlock detected
2018-02-12 09:24:52.176 UTC [3098] DETAIL:  Process 3098 waits for ShareLock on transaction 219201; blocked by process 3099.
	Process 3099 waits for ShareLock on transaction 219200; blocked by process 3098.
	Process 3098: SELECT * FROM table WHERE id = 2 FOR UPDATE;
	Process 3099: SELECT * FROM table WHERE id = 1 FOR UPDATE;
2018-02-12 09:24:52.176 UTC [3098] HINT:  See server log for query details.
2018-02-12 09:24:52.176 UTC [3098] CONTEXT:  while locking tuple (0,1) in relation &amp;quot;table&amp;quot;
2018-02-12 09:24:52.176 UTC [3098] STATEMENT:  SELECT * FROM table WHERE id = 2 FOR UPDATE;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You might think that deadlocks never happen in production, but the unfortunate truth is that heavy use of ORM frameworks can hide the circular dependency situation that produces deadlocks, and it&apos;s certainly something to watch out for when you make use of complex transactions.&lt;/p&gt;
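&lt;p&gt;A common way to avoid this kind of circular wait is to have every transaction acquire its row locks in the same order. A minimal sketch, assuming a hypothetical &lt;code &gt;accounts&lt;/code&gt; table and that both sessions need rows 1 and 2:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Run by both sessions: lock the rows in a consistent order (ascending id)
BEGIN;
SELECT * FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
-- ... work with both rows ...
COMMIT;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With this ordering, the second session waits for the first one to commit instead of deadlocking against it.&lt;/p&gt;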
&lt;h2 id=&quot;checkpoints&quot; &gt;&lt;a href=&quot;#checkpoints&quot; aria-label=&quot;checkpoints permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Checkpoints&lt;/h2&gt;
&lt;p&gt;Last but not least, checkpoints. For those unfamiliar, checkpointing is the mechanism by which PostgreSQL persists all changes to the data directory, which before were only in shared buffers and the WAL. It&apos;s what gives you a consistent copy of your data in one place (the data directory).&lt;/p&gt;
&lt;p&gt;Because checkpoints have to write out all the changes you&apos;ve submitted to the database (which before were already written to the WAL), they can produce quite a lot of I/O - in particular when you are actively loading data.&lt;/p&gt;
&lt;p&gt;The easiest way to produce a checkpoint is to call &lt;code &gt;CHECKPOINT&lt;/code&gt;, but very few people would do that frequently in production. Instead, Postgres has a mechanism that automatically triggers a checkpoint, most commonly due to either &lt;code &gt;time&lt;/code&gt; or &lt;code &gt;xlog&lt;/code&gt;. After setting &lt;code &gt;log_checkpoints = on&lt;/code&gt; you can see this in the logs like this:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;Feb 09 08:30:07am PST 12772 LOG: checkpoint starting: time
Feb 09 08:15:50am PST 12772 LOG: checkpoint starting: xlog
Feb 09 08:10:39am PST 12772 LOG: checkpoint starting: xlog&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or when visualized over time, it can look like this:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/6bb89bdedc1b3b85d87d722a9985a14a/58354/checkpoint_starting.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Log Insights: Checkpoint Starting analysis&quot;
        title=&quot;Log Insights: Checkpoint Starting analysis&quot;
        src=&quot;https://pganalyze.com/static/6bb89bdedc1b3b85d87d722a9985a14a/1d69c/checkpoint_starting.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Occasionally Postgres will also output the following warning, which hints at the tuning you can do:&lt;/p&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;Feb 09 10:21:11am PST 5677 LOG: checkpoints are occurring too frequently (17 seconds apart)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With checkpoints you want to avoid having them occur too frequently, as each checkpoint will produce significant I/O, as well as cause all changes that are written to WAL right after to be written as a &lt;a href=&quot;https://www.postgresql.org/docs/10/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES&quot;&gt;full-page write&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Ideally you would see checkpoints spaced out evenly and usually started by &lt;code &gt;time&lt;/code&gt; instead of &lt;code &gt;xlog&lt;/code&gt;. You can influence this behavior by the following config settings:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code &gt;checkpoint_timeout&lt;/code&gt; - the time after which a &lt;code &gt;time&lt;/code&gt; checkpoint will be kicked off (defaults to every 5 minutes)&lt;/li&gt;
&lt;li&gt;&lt;code &gt;max_wal_size&lt;/code&gt; - the maximum amount of WAL that will be accumulated before an &lt;code &gt;xlog&lt;/code&gt; checkpoint gets triggered (defaults to 1 GB)&lt;/li&gt;
&lt;li&gt;&lt;code &gt;checkpoint_completion_target&lt;/code&gt; - how quickly a checkpoint finishes (defaults to &lt;code &gt;0.5&lt;/code&gt; which means it will finish in half the time of &lt;code &gt;checkpoint_timeout&lt;/code&gt;, i.e. 2.5 minutes)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On many production systems I&apos;ve seen &lt;code &gt;max_wal_size&lt;/code&gt; increased to support higher write rates, &lt;code &gt;checkpoint_timeout&lt;/code&gt; slightly increased as well to avoid too frequent time-based checkpoints, and &lt;code &gt;checkpoint_completion_target&lt;/code&gt; set to &lt;code &gt;0.9&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You should however tune all of this based on your own system, and the logs, so you can choose what&apos;s correct for your setup. Also note that less frequent checkpoints mean recovery of the server is going to take longer, as Postgres will have to replay all WAL, starting from the previous checkpoint, when booting after a crash.&lt;/p&gt;
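&lt;p&gt;As a rough sketch of how this could look in practice: the first query compares time-based and requested checkpoints using &lt;code &gt;pg_stat_bgwriter&lt;/code&gt; (where these counters live in the Postgres versions current at the time of writing), and the &lt;code &gt;ALTER SYSTEM&lt;/code&gt; statements apply the kind of settings described above. The values for &lt;code &gt;max_wal_size&lt;/code&gt; and &lt;code &gt;checkpoint_timeout&lt;/code&gt; are placeholders, not recommendations:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- How many checkpoints were started by time vs. requested (e.g. due to WAL volume)?
SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;

-- Placeholder values, adjust based on your own logs and write rate
ALTER SYSTEM SET max_wal_size = &apos;4GB&apos;;
ALTER SYSTEM SET checkpoint_timeout = &apos;10min&apos;;
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf(); -- these three settings can be changed without a restart&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;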
&lt;h2 id=&quot;conclusion&quot; &gt;&lt;a href=&quot;#conclusion&quot; aria-label=&quot;conclusion permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Postgres log files contain a treasure trove of useful data you can analyze in order to make your system run faster, as well as debug production issues. This data is readily available, but often difficult to parse.&lt;/p&gt;
&lt;p&gt;This article tries to point the way towards which log lines are worth filtering for on production systems.&lt;/p&gt;
&lt;p&gt;If you don&apos;t want to bother with setting up your own filters in a third party logging system, try out &lt;a href=&quot;https://pganalyze.com/blog/postgres-log-monitoring-with-pganalyze/&quot;&gt;pganalyze Postgres Log Insights&lt;/a&gt;: a real-time PostgreSQL log analysis and log monitoring system built into pganalyze.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Visualizing & Tuning Postgres Autovacuum]]></title><description><![CDATA[In this post we'll take a deep dive into one of the mysteries of PostgreSQL: VACUUM and autovacuum. The Postgres autovacuum logic can be tricky to understand and tune - it has many moving parts,
and is hard to understand, in particular for application developers who don't spend
all day looking at database documentation. But luckily there are recent improvements in Postgres, in particular the addition of
pg_stat_progress_vacuum
in Postgres 9.6, that make understanding autovacuum and VACUUM…]]></description><link>https://pganalyze.com/blog/visualizing-and-tuning-postgres-autovacuum</link><guid isPermaLink="false">https://pganalyze.com/blog/visualizing-and-tuning-postgres-autovacuum</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Tue, 28 Nov 2017 00:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/939f208fc12f8c026ee0fb8e800af11c/5df5d/timeline_short.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;VACUUM timeline visualization&quot;
        title=&quot;VACUUM timeline visualization&quot;
        src=&quot;https://pganalyze.com/static/939f208fc12f8c026ee0fb8e800af11c/1d69c/timeline_short.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;In this post we&apos;ll take a deep dive into one of the mysteries of PostgreSQL: VACUUM and autovacuum.&lt;/p&gt;
&lt;p&gt;The Postgres autovacuum logic has many moving parts, and can be tricky to understand and tune,
in particular for application developers who don&apos;t spend
all day looking at database documentation.&lt;/p&gt;
&lt;p&gt;But luckily there are recent improvements in Postgres, in particular the addition of
&lt;a href=&quot;https://www.postgresql.org/docs/10/static/progress-reporting.html&quot;&gt;pg_stat_progress_vacuum&lt;/a&gt;
in Postgres 9.6, that make understanding autovacuum and VACUUM
behavior a bit easier.&lt;/p&gt;
&lt;p&gt;In this post we describe an approach to autovacuum tuning that is based on sampling
these statistics over time, visualizing them, and then making tuning decisions based on data.
The visualizations shown are all screenshots of real data, and are available for
early access in pganalyze.&lt;/p&gt;
&lt;h2 id=&quot;why-vacuum&quot; &gt;&lt;a href=&quot;#why-vacuum&quot; aria-label=&quot;why vacuum permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why VACUUM?&lt;/h2&gt;
&lt;p&gt;First of all, a quick 101 on why we need VACUUM:&lt;/p&gt;
&lt;p&gt;When you perform UPDATE and DELETE operations on a table in Postgres,
the database has to keep around the old row data for concurrently running queries and transactions,
due to its MVCC model. Once all concurrent transactions that have seen these old rows have finished,
they effectively become dead rows which will need to be removed.&lt;/p&gt;
&lt;p&gt;VACUUM is the process by which PostgreSQL cleans up these dead rows, and turns the space they have
occupied into usable space again, to be used for future writes.&lt;/p&gt;
&lt;p&gt;A more detailed description can be found in the &lt;a href=&quot;https://www.postgresql.org/docs/10/static/routine-vacuuming.html&quot;&gt;PostgreSQL documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;which-tables-have-vacuum-running&quot; &gt;&lt;a href=&quot;#which-tables-have-vacuum-running&quot; aria-label=&quot;which tables have vacuum running permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Which tables have VACUUM running?&lt;/h2&gt;
&lt;p&gt;The easiest thing you can check on a running PostgreSQL system is which VACUUM
operations are running right now. In all Postgres versions this information shows up in the &lt;code &gt;pg_stat_activity&lt;/code&gt; view.
Look for query values that start with &quot;autovacuum: &quot;, or which contain the word &quot;VACUUM&quot;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; pid&lt;span &gt;,&lt;/span&gt; query &lt;span &gt;FROM&lt;/span&gt; pg_stat_activity &lt;span &gt;WHERE&lt;/span&gt; query &lt;span &gt;LIKE&lt;/span&gt; &lt;span &gt;&apos;autovacuum: %&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;-------+----------------------------------------------------------------------------
 10469 | autovacuum: VACUUM ANALYZE public.schema_columns
 12848 | autovacuum: VACUUM public.replication_follower_stats (to prevent wraparound)
 28626 | autovacuum: VACUUM public.schema_index_stats (to prevent wraparound)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Based on sampling this data, we can generate a timeline view that helps us distinguish
between tables that are frequently vacuumed, tables that have long-running vacuums, and
tables that don&apos;t get vacuumed much at all.&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/0b80ae48b95f283af7b2d8c9a80c4049/7970d/timeline.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;VACUUM timeline visualization with details&quot;
        title=&quot;VACUUM timeline visualization with details&quot;
        src=&quot;https://pganalyze.com/static/0b80ae48b95f283af7b2d8c9a80c4049/1d69c/timeline.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;In the screenshot you can see the top 10 tables (by frequency) colored the same way,
and in particular the table that&apos;s colored light yellow stands out as effectively
having VACUUM running continuously.&lt;/p&gt;
&lt;p&gt;We can also see that one manual VACUUM was started by the DBA user (colored in cyan),
and that it ran much quicker than the same colored version started by autovacuum
earlier in the day.&lt;/p&gt;
&lt;h2 id=&quot;when-does-autovacuum-run&quot; &gt;&lt;a href=&quot;#when-does-autovacuum-run&quot; aria-label=&quot;when does autovacuum run permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;When does autovacuum run?&lt;/h2&gt;
&lt;p&gt;Another question that frequently comes up is, why did autovacuum decide to start
VACUUMing a table?&lt;/p&gt;
&lt;p&gt;There are essentially two major reasons:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1) To prevent Transaction ID wraparound&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The number of non-frozen transaction IDs has reached &quot;autovacuum_freeze_max_age&quot;
(default 200 million transactions), and VACUUM is required to prevent
transaction ID wraparound.&lt;/p&gt;
&lt;p&gt;We won&apos;t go too much into detail on tuning this parameter in this post, but rather reserve this as a
follow-on topic.&lt;/p&gt;
&lt;p&gt;Note that this can&apos;t be disabled, so it will cause autovacuum to start VACUUM,
even if it is otherwise disabled. If you keep cancelling autovacuum processes
started for this reason you will eventually have to perform a manual VACUUM,
as Postgres will shut down the database otherwise.&lt;/p&gt;
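&lt;p&gt;To get a rough idea of which tables are closest to such a wraparound-prevention VACUUM, one option is to compare the age of each table&apos;s &lt;code &gt;relfrozenxid&lt;/code&gt; in &lt;code &gt;pg_class&lt;/code&gt; with the configured limit. A minimal sketch, which ignores per-table overrides of the setting (the &lt;code &gt;LIMIT&lt;/code&gt; of 10 is arbitrary):&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Tables ordered by how old their oldest unfrozen transaction ID is
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting(&apos;autovacuum_freeze_max_age&apos;)::bigint AS freeze_max_age
  FROM pg_class c
 WHERE c.relkind = &apos;r&apos;
 ORDER BY age(c.relfrozenxid) DESC
 LIMIT 10;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;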
&lt;p&gt;&lt;strong&gt;2) To mark dead rows &amp;#x26; enable re-use for new data&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As you run UPDATEs and DELETEs, dead rows will accumulate, as described earlier
in the post. Once the number of dead rows (or tuples) has exceeded the threshold,
autovacuum will start a VACUUM run.&lt;/p&gt;
&lt;p&gt;The following formula is used to decide whether vacuuming is needed:&lt;/p&gt;
&lt;div  data-language=&quot;text&quot;&gt;&lt;pre &gt;&lt;code &gt;vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;By default the base threshold is 50 rows, and the scale factor is 20%. That means
a table will be vacuumed as soon as the number of dead rows exceeds 20% of all
rows in the table plus 50 rows.&lt;/p&gt;
&lt;p&gt;In order to understand when this gets triggered, you can look at the &lt;code &gt;n_live_tup&lt;/code&gt; and &lt;code &gt;n_dead_tup&lt;/code&gt;
values in &lt;a href=&quot;https://www.postgresql.org/docs/10/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW&quot;&gt;pg_stat_user_tables&lt;/a&gt;:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; pg_stat_user_tables &lt;span &gt;WHERE&lt;/span&gt; relname &lt;span &gt;=&lt;/span&gt; &lt;span &gt;&apos;backend_states&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;-[ RECORD 1 ]-------+------------------------------
relid               | 732156523
schemaname          | public
relname             | backend_states
...
n_live_tup          | 23047184
n_dead_tup          | 108373
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
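&lt;p&gt;As an approximation, the threshold from the formula can be computed directly in SQL. This sketch uses &lt;code &gt;n_live_tup&lt;/code&gt; as the number of tuples and only the global settings; it assumes no per-table autovacuum storage parameters are set, and autovacuum itself works from the row estimate in &lt;code &gt;pg_class.reltuples&lt;/code&gt;, so treat the result as an estimate:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;-- Approximate dead row count at which autovacuum would start a VACUUM
SELECT relname,
       n_dead_tup,
       current_setting(&apos;autovacuum_vacuum_threshold&apos;)::bigint
         + current_setting(&apos;autovacuum_vacuum_scale_factor&apos;)::float8 * n_live_tup
         AS approx_vacuum_threshold
  FROM pg_stat_user_tables
 WHERE relname = &apos;backend_states&apos;;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For the table above and the default settings this works out to 50 + 0.2 * 23047184, or roughly 4.6 million dead rows, so with 108373 dead rows the table is still far from its next VACUUM.&lt;/p&gt;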
&lt;p&gt;We can then take this information, together with the autovacuum settings, and visualize it:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/043b1a8340aa3f989198da109eaacd97/0f882/vacuum_table.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;VACUUM table&quot;
        title=&quot;VACUUM table&quot;
        src=&quot;https://pganalyze.com/static/043b1a8340aa3f989198da109eaacd97/1d69c/vacuum_table.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Here you can see that as soon as the dead tuples (grey/red area) reach the threshold (grey line),
a VACUUM process kicks off (red line in the lower graph).&lt;/p&gt;
&lt;p&gt;On a table where VACUUM can&apos;t keep up, which results in bloat due to dead rows,
this would instead look like this:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/59e3ed24aa959efa2876311d1a2f64f2/e4ba2/vacuum_table_frequent.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;VACUUM table alternative&quot;
        title=&quot;VACUUM table alternative&quot;
        src=&quot;https://pganalyze.com/static/59e3ed24aa959efa2876311d1a2f64f2/1d69c/vacuum_table_frequent.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;h2 id=&quot;how-fast-does-autovacuum-run&quot; &gt;&lt;a href=&quot;#how-fast-does-autovacuum-run&quot; aria-label=&quot;how fast does autovacuum run permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How fast does autovacuum run?&lt;/h2&gt;
&lt;p&gt;A VACUUM process that was started by autovacuum is artificially throttled in the default
PostgreSQL configuration, so it doesn&apos;t fully utilize the CPU and I/O available.&lt;/p&gt;
&lt;p&gt;That is the correct way to operate for most systems, as you wouldn&apos;t want VACUUM to
slow down application queries during business hours.&lt;/p&gt;
&lt;p&gt;The system that Postgres follows for this is that every VACUUM operation accumulates
cost, which you can think of as points that get added up:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#GUC-VACUUM-COST-PAGE-HIT&quot;&gt;vacuum_cost_page_hit&lt;/a&gt; (cost for vacuuming a page found in the buffer cache, default 1)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#GUC-VACUUM-COST-PAGE-MISS&quot;&gt;vacuum_cost_page_miss&lt;/a&gt; (cost for vacuuming a page retrieved from disk, default 10)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#GUC-VACUUM-COST-PAGE-DIRTY&quot;&gt;vacuum_cost_page_dirty&lt;/a&gt; (cost for writing back a modified page to disk, default 20)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once the sum of costs has reached autovacuum_vacuum_cost_limit (default -1, which falls back to vacuum_cost_limit, 200 by default),
the VACUUM process will sleep and do nothing for autovacuum_vacuum_cost_delay (default 20 ms). A manual VACUUM is not throttled by default, since vacuum_cost_delay defaults to 0.&lt;/p&gt;
&lt;p&gt;With the default parameters, that means that autovacuum will at most write 4MB/s to disk, and read 8MB/s from disk or the OS page cache.&lt;/p&gt;
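&lt;p&gt;To see where those numbers come from, here is a rough sketch of the arithmetic in Ruby. It assumes the default 8 kB block size and treats the limits as a pure upper bound, ignoring the time VACUUM spends doing the actual work. The constant and method names are ours, chosen for illustration only.&lt;/p&gt;

```ruby
# Back-of-the-envelope autovacuum I/O ceiling, using the defaults quoted above.
PAGE_SIZE_BYTES = 8192 # assumption: default 8 kB block size
COST_LIMIT      = 200  # cost budget per round
COST_DELAY_MS   = 20   # sleep once the budget is spent
COST_PAGE_MISS  = 10   # vacuum_cost_page_miss
COST_PAGE_DIRTY = 20   # vacuum_cost_page_dirty

# Upper bound in MB/s if every page incurred only the given cost.
# Real throughput is lower, since the work itself also takes time.
def max_mb_per_second(cost_per_page)
  rounds_per_second = 1000.0 / COST_DELAY_MS
  pages_per_second  = rounds_per_second * COST_LIMIT / cost_per_page
  pages_per_second * PAGE_SIZE_BYTES / 1_000_000.0
end

READ_MB_PER_SEC  = max_mb_per_second(COST_PAGE_MISS)
WRITE_MB_PER_SEC = max_mb_per_second(COST_PAGE_DIRTY)
puts format("read: %.1f MB/s, write: %.1f MB/s", READ_MB_PER_SEC, WRITE_MB_PER_SEC)
```

&lt;p&gt;Raising the cost limit or lowering the delay scales both ceilings linearly, which is a quick way to sanity-check a planned change.&lt;/p&gt;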
&lt;p&gt;&lt;a href=&quot;https://pganalyze.com/ebooks/optimizing-postgres-query-performance&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        title=&quot;Download Free eBook: How To Get 3x Faster Postgres&quot;
        src=&quot;https://pganalyze.com/static/c15d0b3082bebd2680b86cc948555f76/acb04/ebook_promo_query_performance.jpg&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;how-far-has-this-vacuum-made-progress&quot; &gt;&lt;a href=&quot;#how-far-has-this-vacuum-made-progress&quot; aria-label=&quot;how far has this vacuum made progress permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How far has this VACUUM made progress?&lt;/h2&gt;
&lt;p&gt;VACUUM runs through three different major phases as part of its operation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scanning Heap&lt;/li&gt;
&lt;li&gt;Vacuuming Indices&lt;/li&gt;
&lt;li&gt;Vacuuming Heap&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As well as a few &lt;a href=&quot;https://www.postgresql.org/docs/10/static/progress-reporting.html#VACUUM-PHASES&quot;&gt;minor phases&lt;/a&gt; that are usually really quick.&lt;/p&gt;
&lt;p&gt;The &quot;Vacuuming Indices&quot; and &quot;Vacuuming Heap&quot; phases might run multiple times if the
&lt;code &gt;autovacuum_work_mem&lt;/code&gt; setting is too low to hold all dead tuples
in memory at once.&lt;/p&gt;
&lt;p&gt;Based on sampling &lt;a href=&quot;https://www.postgresql.org/docs/10/static/progress-reporting.html&quot;&gt;pg_stat_progress_vacuum&lt;/a&gt; we can visualize in detail what goes on:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/c799301db2bace33f8bc62383323647b/1acf3/vacuum_detail.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;VACUUM details&quot;
        title=&quot;VACUUM details&quot;
        src=&quot;https://pganalyze.com/static/c799301db2bace33f8bc62383323647b/1d69c/vacuum_detail.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;This works even while an autovacuum or manual VACUUM is still running, so we can
get a visual indication of roughly how long we will have to wait for it to finish.&lt;/p&gt;
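&lt;p&gt;If you want a similar estimate from your own scripts, a minimal sketch could look like the following. It assumes you have already fetched one row of pg_stat_progress_vacuum into a Hash (the sample values here are invented), and it only covers the heap scan, not the index passes. The method name is a hypothetical helper.&lt;/p&gt;

```ruby
# One sampled row of pg_stat_progress_vacuum, represented as a plain Hash.
# The numbers are made up; a real script would fetch the row from the database.
SAMPLE = {
  "phase"              => "scanning heap",
  "heap_blks_total"    => 120_000,
  "heap_blks_scanned"  => 30_000,
  "heap_blks_vacuumed" => 12_000,
  "index_vacuum_count" => 1
}

# Percentage of heap blocks already scanned; 0.0 for an empty table.
def heap_scan_percent(row)
  total = row.fetch("heap_blks_total")
  return 0.0 if total.zero?
  (100.0 * row.fetch("heap_blks_scanned") / total).round(1)
end

SCAN_PERCENT = heap_scan_percent(SAMPLE)
puts "#{SAMPLE["phase"]}: #{SCAN_PERCENT}% of heap scanned, " \
     "#{SAMPLE["index_vacuum_count"]} index pass(es) so far"
```

&lt;p&gt;Sampling this every few seconds and comparing consecutive percentages gives a rough rate, though the index passes can make progress look stalled for a while.&lt;/p&gt;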
&lt;h2 id=&quot;what-should-i-tune-first&quot; &gt;&lt;a href=&quot;#what-should-i-tune-first&quot; aria-label=&quot;what should i tune first permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What should I tune first?&lt;/h2&gt;
&lt;p&gt;In general, one might think that VACUUM is an expensive operation, and you&apos;d want
to only run it infrequently, maybe even as a nightly maintenance task.&lt;/p&gt;
&lt;p&gt;That however is often the wrong way to approach it, as rarely run VACUUMs are much
more expensive since they have more work to do, and it also means your system
will spend more time in a sub-optimal state.&lt;/p&gt;
&lt;p&gt;Instead, try to have VACUUM run more often, in proportion to UPDATEs and DELETEs your
application performs. Frequently run VACUUMs will be faster, as there is less work to perform.&lt;/p&gt;
&lt;p&gt;There are two primary tunings you should consider on production Postgres databases:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1) Lower autovacuum_vacuum_scale_factor on tables with old, inactive data&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For tables with a lot of old, inactive data, consider lowering the threshold by
which autovacuum is triggered. Because the calculation is based on the total number
of rows in the table, autovacuum will not kick in even when most of the recent rows have been
modified: the overall number of dead rows is still far below the default
threshold of 20% of the table.&lt;/p&gt;
&lt;p&gt;However, you will see the impact of dead rows on your query performance, as the
dead rows have to be scanned over when reading data. Reducing the scale factor to
keep down the total number of dead rows can make sense in such cases.&lt;/p&gt;
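&lt;p&gt;As a rough illustration of why the default struggles on large tables, the sketch below applies the documented trigger formula (autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor times the row count). The 100 million row table is a made-up example, and the method name is ours.&lt;/p&gt;

```ruby
# Dead rows needed before autovacuum considers a table, per the documented formula:
#   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row count
def autovacuum_trigger_rows(row_count, scale_factor: 0.2, base_threshold: 50)
  (base_threshold + scale_factor * row_count).round
end

# Hypothetical table: 100 million rows, of which only the newest few million change.
ROWS = 100_000_000
DEFAULT_TRIGGER = autovacuum_trigger_rows(ROWS)
LOWERED_TRIGGER = autovacuum_trigger_rows(ROWS, scale_factor: 0.05)
puts "default: #{DEFAULT_TRIGGER} dead rows, lowered: #{LOWERED_TRIGGER} dead rows"
```

&lt;p&gt;With the defaults, roughly 20 million dead rows have to pile up before autovacuum starts on this table; a scale factor of 0.05 cuts that to about 5 million.&lt;/p&gt;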
&lt;p&gt;&lt;strong&gt;2) Adjust autovacuum_vacuum_cost_limit / autovacuum_vacuum_cost_delay for bigger machines&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The default settings for throttling are quite conservative on modern systems. Unless
you run on the smallest instance type, or with the cheapest storage, it often makes sense
to speed up autovacuum a bit.&lt;/p&gt;
&lt;p&gt;In addition, for small tables that have a lot of updates/deletes, it can happen that autovacuum is not
able to keep up, and that you will see new VACUUMs start pretty much right after
the previous one was finished. In such cases adjusting the throttling on a per-table basis
might also make sense.&lt;/p&gt;
&lt;p&gt;Note that most autovacuum configuration
settings can be overridden on a per-table basis:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;ALTER&lt;/span&gt; &lt;span &gt;TABLE&lt;/span&gt; my_table &lt;span &gt;SET&lt;/span&gt; &lt;span &gt;(&lt;/span&gt;autovacuum_vacuum_scale_factor &lt;span &gt;=&lt;/span&gt; &lt;span &gt;0.05&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It often makes sense to review that table&apos;s particular statistics, e.g. how often is
the table updated and how many dead tuples does it accumulate, before modifying
autovacuum settings.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The visualizations shown in this post are based on real data, and are now available
for early access to all pganalyze customers on the Scale plan and higher.&lt;/p&gt;
&lt;p&gt;Reach out to have this feature enabled for your account - we&apos;d be happy to walk you
through it, and help you tune autovacuum on your database.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Whats New in Postgres 10: Monitoring Improvements]]></title><description><![CDATA[Postgres 10 has been stamped on Monday, and will most likely be released this week, so this seems like a good time
to review what this new release brings in terms of Monitoring functionality built into the database. In this post you'll see a few things that we find exciting about the new release, as well as
some tips on what to adjust, whether you use a hosted Postgres monitoring tool like pganalyze,
or if you've written your own scripts. New "pg_monitor" Monitoring Role Most users of Postgres…]]></description><link>https://pganalyze.com/blog/whats-new-in-postgres-10-monitoring-improvements</link><guid isPermaLink="false">https://pganalyze.com/blog/whats-new-in-postgres-10-monitoring-improvements</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Wed, 04 Oct 2017 00:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;Postgres 10 has been stamped on Monday, and will most likely be released this week, so this seems like a good time
to review what this new release brings in terms of Monitoring functionality built into the database.&lt;/p&gt;
&lt;p&gt;In this post you&apos;ll see a few things that we find exciting about the new release, as well as
some tips on what to adjust, whether you use a hosted Postgres monitoring tool like pganalyze,
or if you&apos;ve written your own scripts.&lt;/p&gt;
&lt;h2 id=&quot;new-pg_monitor-monitoring-role&quot; &gt;&lt;a href=&quot;#new-pg_monitor-monitoring-role&quot; aria-label=&quot;new pg_monitor monitoring role permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;New &quot;pg_monitor&quot; Monitoring Role&lt;/h2&gt;
&lt;p&gt;Most users of Postgres obviously don&apos;t want to give monitoring tools superuser access, but in
the past this was often required, as many Postgres statistic views (e.g. pg_stat_statements)
only show the values for the current user, unless you are superuser.&lt;/p&gt;
&lt;p&gt;This meant that you had to work around it with &lt;code &gt;SECURITY DEFINER&lt;/code&gt; functions that query
the statistics views as superuser, but can be called by a restricted user.&lt;/p&gt;
&lt;p&gt;Now, you can use the monitoring role in Postgres 10 to instead give a user specific
access to monitor statistics views, without giving out any other access.&lt;/p&gt;
&lt;p&gt;It&apos;s as simple as:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;GRANT&lt;/span&gt; pg_monitor &lt;span &gt;TO&lt;/span&gt; monitoring_user&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And afterwards that user can simply access statistics views without running into &lt;code &gt;&amp;lt;insufficient privilege&gt;&lt;/code&gt; issues like before.&lt;/p&gt;
&lt;p&gt;This also works with pganalyze out of the box, so once you upgrade to 10 you can
simply grant the monitoring role to the pganalyze user, and drop the helper
functions we&apos;ve previously asked you to create.&lt;/p&gt;
&lt;p&gt;A subset of often used views that the monitoring role now grants you access to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;pg_stat_statements&lt;/li&gt;
&lt;li&gt;pg_stat_activity&lt;/li&gt;
&lt;li&gt;pg_stat_replication&lt;/li&gt;
&lt;li&gt;pg_stat_progress_vacuum&lt;/li&gt;
&lt;li&gt;.. &lt;a href=&quot;https://www.postgresql.org/docs/10/static/monitoring-stats.html&quot;&gt;and more&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that there are more &lt;a href=&quot;https://www.postgresql.org/docs/10/static/default-roles.html&quot;&gt;fine-grained roles&lt;/a&gt; you can assign, should you want to.&lt;/p&gt;
&lt;h2 id=&quot;renaming-of-xlog-to-wal-and-location-to-lsn&quot; &gt;&lt;a href=&quot;#renaming-of-xlog-to-wal-and-location-to-lsn&quot; aria-label=&quot;renaming of xlog to wal and location to lsn permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Renaming of &quot;xlog&quot; to &quot;wal&quot;, and &quot;location&quot; to &quot;lsn&quot;&lt;/h2&gt;
&lt;p&gt;If you&apos;ve written your own monitoring scripts to check replication lag, and other
statistics that have to do with WAL or LSNs, you&apos;ll need to update some function names.&lt;/p&gt;
&lt;p&gt;In this new release, besides the WAL directory being renamed from &quot;pg_xlog&quot; to &quot;pg_wal&quot;,
all system administration functions have also been renamed to match this change. In addition,
functions that previously had &quot;location&quot; in their name now use &quot;lsn&quot; instead.&lt;/p&gt;
&lt;p&gt;You are most likely going to run into this with the often used &lt;code &gt;pg_current_xlog_location&lt;/code&gt; (now &lt;code &gt;pg_current_wal_lsn&lt;/code&gt;), as well as the helper method &lt;code &gt;pg_xlog_location_diff&lt;/code&gt; (now &lt;code &gt;pg_wal_lsn_diff&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Also note that the &lt;code &gt;sent_location&lt;/code&gt;, &lt;code &gt;write_location&lt;/code&gt;, etc fields in &lt;code &gt;pg_stat_replication&lt;/code&gt; have been renamed to &lt;code &gt;sent_lsn&lt;/code&gt;, &lt;code &gt;write_lsn&lt;/code&gt; and so forth.&lt;/p&gt;
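&lt;p&gt;For scripts that cannot call &lt;code &gt;pg_wal_lsn_diff&lt;/code&gt; on the server, the same byte difference can be computed client-side. This is a minimal sketch, assuming LSNs in their usual textual form of two hexadecimal numbers separated by a slash; the method names are hypothetical helpers, not part of any library.&lt;/p&gt;

```ruby
# Convert a textual LSN such as "16/B374D848" into a byte position.
# The part before the slash holds the high 32 bits, the part after it
# the low 32 bits, both written in hexadecimal.
def lsn_to_bytes(lsn)
  high, low = lsn.split("/").map { |part| part.to_i(16) }
  high * 2**32 + low
end

# Byte distance between two LSNs, the same quantity that
# pg_wal_lsn_diff(a, b) reports on the server.
def lsn_diff(a, b)
  lsn_to_bytes(a) - lsn_to_bytes(b)
end

# Example values only: a primary position and a replica position.
LAG_BYTES = lsn_diff("0/3000060", "0/3000000")
puts "replica is #{LAG_BYTES} bytes behind"
```

&lt;p&gt;The same helper works regardless of whether the LSNs were read from the old &lt;code &gt;*_location&lt;/code&gt; columns or the new &lt;code &gt;*_lsn&lt;/code&gt; ones, since only the names changed.&lt;/p&gt;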
&lt;h2 id=&quot;wait-events--non-client-connections-in-pg_stat_activity&quot; &gt;&lt;a href=&quot;#wait-events--non-client-connections-in-pg_stat_activity&quot; aria-label=&quot;wait events  non client connections in pg_stat_activity permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Wait Events &amp;#x26; Non-Client Connections in pg_stat_activity&lt;/h2&gt;
&lt;p&gt;The &lt;code &gt;pg_stat_activity&lt;/code&gt; view and underlying data structure have been thoroughly improved in this release, and now show not just client connections and autovacuum, but also other background workers that are running in the system:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; pid&lt;span &gt;,&lt;/span&gt; backend_type&lt;span &gt;,&lt;/span&gt; backend_start &lt;span &gt;FROM&lt;/span&gt; pg_stat_activity &lt;span &gt;WHERE&lt;/span&gt; backend_type &lt;span &gt;!=&lt;/span&gt; &lt;span &gt;&apos;client backend&apos;&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt; pid |    backend_type     |         backend_start         
-----+---------------------+-------------------------------
  58 | autovacuum launcher | 2017-10-03 21:02:45.458053+00
  60 | background worker   | 2017-10-03 21:02:45.459172+00
  56 | background writer   | 2017-10-03 21:02:45.457657+00
  55 | checkpointer        | 2017-10-03 21:02:45.457491+00
  57 | walwriter           | 2017-10-03 21:02:45.457817+00&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you have previously written monitoring scripts that rely on counting the number of entries in pg_stat_activity, you should filter the view by &lt;code &gt;backend_type = &apos;client backend&apos;&lt;/code&gt;, or switch to using &lt;code &gt;numbackends&lt;/code&gt; from &lt;code &gt;pg_stat_database&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In addition to this, the new release also brings an additional 115 wait events (visible in &lt;code &gt;wait_event_type&lt;/code&gt; and &lt;code &gt;wait_event&lt;/code&gt; in &lt;code &gt;pg_stat_activity&lt;/code&gt;), in particular more than 60 new I/O related events which help you understand better what a query is busy with.&lt;/p&gt;
&lt;p&gt;You can find the full list of wait events in the &lt;a href=&quot;https://www.postgresql.org/docs/10/static/monitoring-stats.html#wait-event-table&quot;&gt;Postgres documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;amcheck&quot; &gt;&lt;a href=&quot;#amcheck&quot; aria-label=&quot;amcheck permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;amcheck&lt;/h2&gt;
&lt;p&gt;Last but not least, a useful feature for consistency checking got added in this release. Initially developed by Peter Geoghegan and battle-tested at Heroku Postgres, this new tool allows you to check a B-Tree index for corruption as well as verify that invariants in the structure of the index are as expected.&lt;/p&gt;
&lt;p&gt;It first needs to be created as &lt;code &gt;CREATE EXTENSION amcheck&lt;/code&gt; and can then be run by a superuser like this:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; bt_index_check&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&apos;my_test_index&apos;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;&lt;span &gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt; bt_index_check
----------------
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;An empty result indicates that the index is consistent, as would be expected.&lt;/p&gt;
&lt;p&gt;Note that amcheck accesses the index through the shared buffer cache, so it might not show problems at the disk level right away. See more details on its &lt;a href=&quot;https://www.postgresql.org/docs/10/static/amcheck.html&quot;&gt;documentation page&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This concludes a short overview of new monitoring functionality in Postgres 10.&lt;/p&gt;
&lt;p&gt;Note that there are many other amazing new features like parallel query, logical replication and declarative partitioning that are not covered in this post.&lt;/p&gt;
&lt;p&gt;If this article proved useful to you, you might also be interested in our &lt;a href=&quot;https://pganalyze.com/blog/postgresql-log-monitoring-101-deadlocks-checkpoints-blocked-queries&quot;&gt;Postgres Log Monitoring 101&lt;/a&gt; article where we take a closer look at Deadlocks, Checkpoint Tuning, and Blocked Queries.&lt;/p&gt; ]]&gt;</content:encoded></item><item><title><![CDATA[Introducing pg_query: Parse PostgreSQL queries in Ruby]]></title><description><![CDATA[In this article we'll take a look at the new pg_query Ruby library. pg_query is a Ruby library I wrote to help you parse SQL queries and work with the PostgreSQL parse tree. We use this extension inside pganalyze to provide contextual information for each query and find columns which might need an index. At the end of this article you'll also find monitor.rb - a ready-to-use example that filters pg_stat_statements output and restricts it to only show a specific table. Existing Solutions to Parse…]]></description><link>https://pganalyze.com/blog/parse-postgresql-queries-in-ruby</link><guid isPermaLink="false">https://pganalyze.com/blog/parse-postgresql-queries-in-ruby</guid><dc:creator><![CDATA[Lukas Fittl]]></dc:creator><pubDate>Tue, 17 Jun 2014 00:00:00 GMT</pubDate><content:encoded>&lt;![CDATA[ &lt;p&gt;In this article we&apos;ll take a look at the new &lt;strong&gt;&lt;a href=&quot;https://github.com/pganalyze/pg_query&quot;&gt;pg_query&lt;/a&gt;&lt;/strong&gt; Ruby library.&lt;/p&gt;
&lt;p&gt;pg_query is a Ruby library I wrote to help you parse SQL queries and work with the PostgreSQL parse tree. We use this extension inside &lt;a href=&quot;https://pganalyze.com&quot;&gt;pganalyze&lt;/a&gt; to provide contextual information for each query and find columns which might need an index.&lt;/p&gt;
&lt;p&gt;At the end of this article you&apos;ll also find &lt;strong&gt;&lt;a href=&quot;https://gist.github.com/lfittl/301542602607b738b23f&quot;&gt;monitor.rb&lt;/a&gt;&lt;/strong&gt; - a ready-to-use example that filters pg_stat_statements output and restricts it to only show a specific table.&lt;/p&gt;
&lt;h2 id=&quot;existing-solutions-to-parse-sql-queries&quot; &gt;&lt;a href=&quot;#existing-solutions-to-parse-sql-queries&quot; aria-label=&quot;existing solutions to parse sql queries permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Existing Solutions to Parse SQL Queries&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://xkcd.com/208/&quot;&gt;&lt;span
      
      
    &gt;
      &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;xkcd comic on regular expressions&quot;
        title=&quot;xkcd comic on regular expressions&quot;
        src=&quot;https://pganalyze.com/static/e6b0aa1e4ff445198ecb4cef11709213/bc962/xkcd_regexp.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
    &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;After a longer period of research on this problem, we&apos;ve come to a few realizations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Obviously, using regular expressions for parsing any complex language is &lt;a href=&quot;http://stackoverflow.com/a/1732454&quot;&gt;a bad idea&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;None of the existing parsers work really well, or are maintained. For example &lt;a href=&quot;https://github.com/andialbrecht/sqlparse&quot;&gt;sqlparse&lt;/a&gt; is focused on re-indenting and beautifying SQL - not for actually working with the query.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Writing and maintaining our own SQL parser is a bad idea. SQL is complex, even for simple things like &lt;a href=&quot;http://www.postgresql.org/docs/current/static/sql-select.html&quot;&gt;SELECT&lt;/a&gt;. And don&apos;t get me started on Common Table Expressions, sub-queries and other fun features.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Our conclusion:&lt;/strong&gt; The only way to correctly parse all valid SQL queries that PostgreSQL understands, now and in the future, is to use PostgreSQL itself.&lt;/p&gt;
&lt;p&gt;And in general, PostgreSQL turns out to have a pretty good SQL parser - other SQL databases &lt;a href=&quot;https://www.youtube.com/watch?v=ZvmMzI0X7fE#t=4m15s&quot;&gt;even use it as a reference implementation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So we&apos;ve pretty much determined that we wanted to use the PostgreSQL parser itself - but how do we access it?&lt;/p&gt;
&lt;h2 id=&quot;accessing-the-postgresql-parser&quot; &gt;&lt;a href=&quot;#accessing-the-postgresql-parser&quot; aria-label=&quot;accessing the postgresql parser permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Accessing the PostgreSQL Parser&lt;/h2&gt;
&lt;p&gt;Let&apos;s get the PostgreSQL server source, go down the rabbit hole and find what we need:&lt;/p&gt;
&lt;div  data-language=&quot;c&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;/*
 * raw_parser
 * Given a query in string form, do lexical
 * and grammatical analysis.
 *
 * Returns a list of raw (un-analyzed) parse trees.
 */&lt;/span&gt;
List &lt;span &gt;*&lt;/span&gt;
&lt;span &gt;raw_parser&lt;/span&gt;&lt;span &gt;(&lt;/span&gt;&lt;span &gt;const&lt;/span&gt; &lt;span &gt;char&lt;/span&gt; &lt;span &gt;*&lt;/span&gt;str&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;{&lt;/span&gt;
	&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;
&lt;span &gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is the C function that takes a query and returns a parse tree as C structs.&lt;/p&gt;
&lt;p&gt;Luckily this function is fairly independent. It does not need pg_catalog access (tables, indices, statistics, etc.) since it runs before the query is rewritten, planned and executed:&lt;/p&gt;
&lt;p&gt;&lt;span
      
      
    &gt;
      &lt;a
    
    href=&quot;https://pganalyze.com/static/9acf49ec25e6461b6ce43b2e8fd2793b/db783/query_execution.png&quot;
    
    target=&quot;_blank&quot;
    rel=&quot;noopener&quot;
  &gt;
    &lt;span
    
    
  &gt;&lt;/span&gt;
  &lt;img
        
        alt=&quot;Diagram of query execution flow in Postgres&quot;
        title=&quot;Diagram of query execution flow in Postgres&quot;
        src=&quot;https://pganalyze.com/static/9acf49ec25e6461b6ce43b2e8fd2793b/db783/query_execution.png&quot;
        
        
        
        loading=&quot;lazy&quot;
        decoding=&quot;async&quot;
      /&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Unfortunately &lt;a href=&quot;https://github.com/postgres/postgres/blob/0a7832005792fa6dad171f9cadb8d587fe0dd800/src/backend/parser/parser.c#L35&quot;&gt;&lt;code &gt;raw_parser(...)&lt;/code&gt;&lt;/a&gt; is not exposed or included in any of the PostgreSQL libraries - and it&apos;s quite difficult to extract the parser from PostgreSQL without taking a whole lot of other code with you.&lt;/p&gt;
&lt;p&gt;The pgpool project &lt;a href=&quot;http://git.postgresql.org/gitweb/?p=pgpool2.git;a=blob;f=src/parser/gram.y;hb=HEAD&quot;&gt;has actually done this&lt;/a&gt;, but they do need to update that code for every new major release. We&apos;ve therefore turned to a slightly different approach:&lt;/p&gt;
&lt;p&gt;We use the PostgreSQL server code directly - by &lt;strong&gt;statically linking the code into our own shared library.&lt;/strong&gt; Through a bit of linking magic, we &lt;a href=&quot;https://github.com/pganalyze/pg_query/blob/e80afe63a2ae10695608ab8d53b10cd7beb32124/ext/pg_query/pg_query.c#L33&quot;&gt;simply call the internal parser functions&lt;/a&gt;, and expose that function through a Ruby interface, to be used like this:&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;require&lt;/span&gt; &lt;span &gt;&apos;pg_query&apos;&lt;/span&gt;

pp &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT 1&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;#&amp;lt;PgQuery:0x007f8cdaa8f8b8&lt;/span&gt;
 &lt;span &gt;@parsetree&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;
  &lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;SELECT&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
     &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;distinctClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;intoClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;targetList&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
       &lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;RESTARGET&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;name&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;indirection&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;val&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;A_CONST&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;val&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;1&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;7&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;7&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;fromClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;whereClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;groupClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;havingClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;windowClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;valuesLists&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;sortClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;limitOffset&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;limitCount&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;lockingClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;withClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;op&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;all&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;false&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;larg&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;rarg&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
 &lt;span &gt;@query&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&quot;SELECT 1&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
 &lt;span &gt;@warnings&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The result is a PostgreSQL parse tree as used by PostgreSQL internally.&lt;/p&gt;
&lt;h2 id=&quot;parsing-normalized-queries&quot; &gt;&lt;a href=&quot;#parsing-normalized-queries&quot; aria-label=&quot;parsing normalized queries permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Parsing Normalized Queries&lt;/h2&gt;
&lt;p&gt;Now, to the interesting part. Assume we collect pg_stat_statements queries like this one:&lt;/p&gt;
&lt;div  data-language=&quot;sql&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;SELECT&lt;/span&gt; &lt;span &gt;&quot;users&quot;&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;*&lt;/span&gt; &lt;span &gt;FROM&lt;/span&gt; &lt;span &gt;&quot;users&quot;&lt;/span&gt; &lt;span &gt;WHERE&lt;/span&gt; &lt;span &gt;&quot;users&quot;&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;&lt;span &gt;&quot;id&quot;&lt;/span&gt; &lt;span &gt;=&lt;/span&gt; ?&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that the actual value has been replaced by the &lt;code &gt;?&lt;/code&gt; character. Unfortunately, the PostgreSQL parser can&apos;t parse queries normalized in this manner. It would simply return a syntax error.&lt;/p&gt;
&lt;p&gt;At first, we simply replaced all occurrences of &lt;code &gt;?&lt;/code&gt; with &lt;code &gt;$0&lt;/code&gt; (a parameter reference) before parsing, so that the query could be parsed correctly.&lt;/p&gt;
&lt;p&gt;There are, however, a few problems with that kind of &quot;dumb&quot; string replacement - most prominently, we&apos;re breaking all operators containing &lt;code &gt;?&lt;/code&gt;, such as those for &lt;a href=&quot;http://www.postgresql.org/docs/devel/static/functions-json.html&quot;&gt;JSONB&lt;/a&gt; in 9.4.&lt;/p&gt;
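&lt;p&gt;To illustrate, here is a minimal Ruby sketch of that naive replacement. The &lt;code &gt;docs&lt;/code&gt; table and its &lt;code &gt;data&lt;/code&gt; column are hypothetical; the first &lt;code &gt;?&lt;/code&gt; is meant to be the JSONB key-exists operator, while the other two stand for normalized values.&lt;/p&gt;

```ruby
# Hypothetical normalized query (table and column names are made up).
# The first "?" is the JSONB key-exists operator, the other two are
# placeholders that replaced the actual values.
normalized = 'SELECT * FROM docs WHERE data ? ? AND id = ?'

# Naive approach: replace every "?" with "$0" before parsing.
# Single quotes keep "$0" literal; gsub only treats backslash
# sequences in the replacement string as special.
naive = normalized.gsub('?', '$0')

puts naive
# prints: SELECT * FROM docs WHERE data $0 $0 AND id = $0
```

&lt;p&gt;The operator is rewritten along with the two placeholders, so the resulting string is no longer the query we started with.&lt;/p&gt;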
&lt;p&gt;Our improved solution to this: &lt;a href=&quot;https://github.com/pganalyze/postgres/compare/REL9_3_STABLE...pg_query?w=1#diff-3&quot;&gt;We&apos;ve patched the PostgreSQL parser&lt;/a&gt; to support &lt;code &gt;?&lt;/code&gt; as a parameter reference (identical to &lt;code &gt;$0&lt;/code&gt;).&lt;/p&gt;
&lt;div  data-language=&quot;ruby&quot;&gt;&lt;pre &gt;&lt;code &gt;&lt;span &gt;require&lt;/span&gt; &lt;span &gt;&apos;pg_query&apos;&lt;/span&gt;

pp &lt;span &gt;PgQuery&lt;/span&gt;&lt;span &gt;.&lt;/span&gt;parse&lt;span &gt;(&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM x WHERE y = ?&quot;&lt;/span&gt;&lt;span &gt;)&lt;/span&gt;
&lt;span &gt;#&amp;lt;PgQuery:0x007f8cdaaaae10&lt;/span&gt;
 &lt;span &gt;@parsetree&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;
  &lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;SELECT&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
     &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;distinctClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;intoClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;targetList&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
       &lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;RESTARGET&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;name&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;indirection&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;val&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;COLUMNREF&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;fields&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;A_STAR&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;7&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;7&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;fromClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
       &lt;span &gt;[&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;RANGEVAR&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
          &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;schemaname&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;relname&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;&quot;x&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;inhOpt&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;2&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;relpersistence&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;&quot;p&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;alias&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
           &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;14&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;whereClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
       &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;AEXPR&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;
         &lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;name&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&quot;=&quot;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;lexpr&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;COLUMNREF&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;fields&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;&quot;y&quot;&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;22&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;rexpr&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;PARAMREF&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;{&lt;/span&gt;&lt;span &gt;&quot;number&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt; &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;26&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
          &lt;span &gt;&quot;location&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;24&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;groupClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;havingClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;windowClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;valuesLists&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;sortClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;limitOffset&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;limitCount&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;lockingClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;withClause&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;op&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;0&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;all&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;false&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;larg&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
      &lt;span &gt;&quot;rarg&quot;&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;span &gt;nil&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;}&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
 &lt;span &gt;@query&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;&quot;SELECT * FROM x WHERE y = ?&quot;&lt;/span&gt;&lt;span &gt;,&lt;/span&gt;
 &lt;span &gt;@warnings&lt;/span&gt;&lt;span &gt;=&lt;/span&gt;&lt;span &gt;[&lt;/span&gt;&lt;span &gt;]&lt;/span&gt;&lt;span &gt;&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Unfortunately, right now, this parser change limits the usage of &lt;code &gt;?&lt;/code&gt; in operators to those in core - specifically JSONB and geometric operators. If you use third-party extensions or custom operators that contain &lt;code &gt;?&lt;/code&gt;, pg_query likely won&apos;t be able to parse those queries.&lt;/p&gt;
&lt;h2 id=&quot;the-result&quot; &gt;&lt;a href=&quot;#the-result&quot; aria-label=&quot;the result permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The Result&lt;/h2&gt;
&lt;p&gt;As a proof of concept, I wrote &lt;strong&gt;&lt;a href=&quot;https://gist.github.com/lfittl/301542602607b738b23f&quot;&gt;monitor.rb&lt;/a&gt;&lt;/strong&gt;, a Ruby script that shows the current information stored inside pg_stat_statements in a top-like manner, filtered by a specific table:&lt;/p&gt;
&lt;div  data-language=&quot;shell&quot;&gt;&lt;pre &gt;&lt;code &gt;monitor.rb -d sampledb -t &lt;span &gt;users&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div  data-language=&quot;code&quot;&gt;&lt;pre &gt;&lt;code &gt;AVG     | QUERY
--------------------------------------------------------------------------------
1.5ms   | SELECT &amp;quot;users&amp;quot;.* FROM &amp;quot;users&amp;quot;
0.1ms   | SELECT &amp;quot;users&amp;quot;.* FROM &amp;quot;users&amp;quot; WHERE &amp;quot;users&amp;quot;.&amp;quot;id&amp;quot; = ? ORDER BY &amp;quot;users&amp;quot;.&amp;quot;id&amp;quot; ASC LIMIT ?
0.1ms   | UPDATE &amp;quot;users&amp;quot; SET &amp;quot;fullname&amp;quot; = $1, &amp;quot;updated_at&amp;quot; = $2 WHERE &amp;quot;users&amp;quot;.&amp;quot;id&amp;quot; = ?
0.0ms   | SELECT &amp;quot;users&amp;quot;.* FROM &amp;quot;users&amp;quot; WHERE &amp;quot;users&amp;quot;.&amp;quot;id&amp;quot; = $1 LIMIT 1&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This could be easily extended to highlight queries accessing large tables, potentially missing indices, etc.&lt;/p&gt;
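&lt;p&gt;Filtering by table comes down to finding the relation names in a parse tree. The following is only a sketch of that idea, not the code from monitor.rb: &lt;code &gt;table_names&lt;/code&gt; is a hypothetical helper, and the tree is a hand-written, trimmed copy of the output shown earlier, so the example runs without the gem installed.&lt;/p&gt;

```ruby
# Hypothetical helper (not part of pg_query): walk a parse tree made of
# nested hashes and arrays, collecting "relname" from every "RANGEVAR" node.
def table_names(node, found = [])
  case node
  when Hash
    rangevar = node['RANGEVAR']
    found.push(rangevar['relname']) if rangevar.is_a?(Hash)
    node.each_value { |child| table_names(child, found) }
  when Array
    node.each { |child| table_names(child, found) }
  end
  found.compact.uniq
end

# Trimmed, hand-written copy of the tree for "SELECT * FROM x WHERE y = ?".
parsetree = [
  { 'SELECT' => {
    'fromClause' => [
      { 'RANGEVAR' => { 'schemaname' => nil, 'relname' => 'x', 'location' => 14 } }
    ],
    'whereClause' => {
      'AEXPR' => {
        'name' => ['='],
        'lexpr' => { 'COLUMNREF' => { 'fields' => ['y'], 'location' => 22 } },
        'rexpr' => { 'PARAMREF' => { 'number' => 0, 'location' => 26 } }
      }
    }
  } }
]

p table_names(parsetree)
# prints: ["x"]
```

&lt;p&gt;A monitoring script could then keep only those statements whose list of table names includes the table it was asked about.&lt;/p&gt;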
&lt;h2 id=&quot;going-forward&quot; &gt;&lt;a href=&quot;#going-forward&quot; aria-label=&quot;going forward permalink&quot; &gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; height=&quot;16&quot; version=&quot;1.1&quot; viewBox=&quot;0 0 16 16&quot; width=&quot;16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Going Forward&lt;/h2&gt;
&lt;p&gt;As you can see, PostgreSQL parse trees are quite useful - and there are many more analysis/grouping options that could be explored.&lt;/p&gt;
&lt;p&gt;If you enjoyed reading this, please give &lt;a href=&quot;https://github.com/pganalyze/pg_query&quot;&gt;pg_query&lt;/a&gt; a try. Simply install it using:&lt;/p&gt;
&lt;div  data-language=&quot;shell&quot;&gt;&lt;pre &gt;&lt;code &gt;gem &lt;span &gt;install&lt;/span&gt; pg_query&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;During installation of the library a full PostgreSQL server is compiled, so it might take 5-10 minutes. Using a gem cache is advised for deployment.&lt;/p&gt;
&lt;p&gt;Interested in support for other languages? &lt;a href=&quot;mailto:lukas@pganalyze.com&quot;&gt;Drop me a line&lt;/a&gt; and I&apos;d love to chat about how we can add support for Python, Perl, you name it.&lt;/p&gt;
&lt;p&gt;Furthermore, we&apos;ll try to get some of our patches upstream for PostgreSQL 9.5 - this specifically relates to our changes in outfuncs.c, supporting additional query nodes and JSON output. Your help and feedback are appreciated.&lt;/p&gt;
&lt;p&gt;And of course, if you build something cool with this, let us know! :)&lt;/p&gt; ]]&gt;</content:encoded></item></channel></rss>