Postgres in Production, Special Series: Deep Dive into pg_stat_statements (Part 4)
02 June, 2026In Part 4 of this special "Postgres in Production" deep dive series, Ryan Booz takes you into the pg_stat_statements source code to show how it decides what to keep when the hash table fills up. There's a per-query counter you never see in the view, eviction sorts every entry on every deallocation, only completed executions get stored at all, and a higher pg_stat_statements.max is not free. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter...
Continue readingPostgres in Production, Special Series: Deep Dive into pg_stat_statements (Part 3)
19 May, 2026In Part 3 of this special "Postgres in Production" deep dive series, Ryan Booz walks through where pg_stat_statements actually stores your query text. Spoiler: it's not in the in-memory hash table you might expect. He covers the file's location on disk, how it grows, the pointer mechanism that ties hash table entries back to your SQL, the quirky consequence that the same query text can appear multiple times, and why ORM-heavy workloads can balloon this file to hundreds of megabytes. Share this...
Continue readingNew in pganalyze: Server Groups and Cluster-Wide Query Performance
13 May, 2026You have a Postgres primary and three read replicas. Two of the replicas are serving slow queries. Where do you start? Until now, pganalyze would have shown you four separate views, one per server. Aggregating the numbers would be a manual process as you flip between tabs, having to remember which queries ran where. When we introduced cluster-aware Index Advisor, we started fixing this fragmented experience. Considering the entire workload before creating index recommendations for a cluster was...
Continue readingPostgres in Production, Special Series: Deep Dive into pg_stat_statements (Part 2)
06 May, 2026In Part 2 of this special "Postgres in Production" deep dive series, Ryan Booz walks through what actually makes an entry in pg_stat_statements "unique", why uniqueness is structural rather than textual, and how a few common patterns (ORMs, dynamic SQL, multi-database setups) can quietly fill up the in-memory hash table and cost you visibility into the queries that matter most. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and...
Continue readingpganalyze MCP Server is now in Public Preview
30 April, 2026Like many of you, we see benefits to use AI tools in our database workflows, but at the same time worry about the risks. Pasting query output into a chat box risks leaking customer data, and pointing an agent at a live production connection can lead to worse. The pganalyze MCP Server takes a different approach: it gives the model curated access to the production database statistics pganalyze already collects, with no direct database access required. Today the pganalyze MCP Server moves from...
Continue readingPostgres in Production, Special Series: Deep Dive into pg_stat_statements (Part 1)
13 April, 2026In this special series of "Postgres in Production", we take a deep dive into pg_stat_statements -- the essential Postgres extension for tracking query performance. In Part 1, Ryan Booz covers what pg_stat_statements is, how to enable it, what it tracks, and critically, what it doesn't track and why that matters for your monitoring setup. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript...
Continue readingWaiting for Postgres 19: Reduced timing overhead for EXPLAIN ANALYZE with RDTSC
11 April, 2026In today’s E122 of “5mins of Postgres” we're talking about the upcoming Postgres 19 release, and how a change in the Postgres instrumentation handling reduces overhead of timing measurements in EXPLAIN ANALYZE using the RDTSC instruction, and why this will allow turning on auto_explain.log_timing for more workloads. We dive into the recently committed change that I (Lukas) authored together with Andres Freund and David Geier. See the full transcript with examples below. Share this episode:...
Continue readingThe Dilemma of the ‘AI DBA’
11 March, 2026Like many in the industry, my perspective on AI tools has shifted considerably over the past year, specifically when it comes to software engineering tasks. Going from “this is nice, but doesn’t really solve complex tasks for me” to “this actually works pretty well for certain use cases.” But the more capable these tools become, the sharper one dilemma gets: you can hand off the work, but an AI agent won’t ultimately be responsible when the database goes down and your app stops working. For...
Continue readingWaiting for Postgres 19: Better Planner Hints with Path Generation Strategies
05 February, 2026In today’s E121 of “5mins of Postgres” we're talking about the upcoming Postgres 19 release, and how better extensibility of the Postgres planner will enable better plan management and planner hint extensions. We dive into the recently committed "path generation strategies" change, the proposed pg_plan_advice extension it was developed for, and show how pg_hint_plan can evolve and lose 2,500 lines of code! Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up...
Continue readingpganalyze Index Advisor is Now Cluster-Aware
04 February, 2026Before now, pganalyze Index Advisor has checked for insights on Postgres databases one server at a time. If you had a primary with three read replicas, you'd get four separate sets of index recommendations, each treating the server workload in isolation. This works for single instance or standalone systems, but it creates problems for scaled out Postgres clusters with replicas that receive read queries. An unused index on your primary might be critical for queries running on your replicas. And...
Continue readingExtended 100 day retention in pganalyze, and the new storage format that made it work
09 January, 2026When it comes to monitoring your database, historic context matters. When we have an incident with a slow query, we want to know: Was this query always slow, or did it regress with a recent code change? At pganalyze, we have developed a purpose-built product for optimizing Postgres. And we love to use Postgres ourself, so our internal timeseries storage is also based on Postgres. But storing timeseries data in Postgres can be challenging: Once you get to a lot of rows (some of our customers...
Continue readingAWS Performance Insights has been deprecated: What to know about CloudWatch Database Insights
06 November, 2025Editor's note (May 2026): AWS has extended the Performance Insights deprecation date to June 30, 2026, from the originally announced November 30, 2025. The post below has been updated to reflect the new timeline; the functionality and pricing comparisons remain accurate. AWS is deprecating the built-in performance monitoring tool in Amazon RDS and Aurora, Performance Insights, on June 30, 2026. AWS customers are asked to migrate to the new Database Insights functionality in CloudWatch before...
Continue readingHow we used pg_query to rewrite queries to fix bad query plans
06 October, 2025Rewriting SQL queries programmatically is harder than it looks. As a human, adding an extra AND condition to a WHERE clause is simple enough. But doing the same thing in code quickly gets complicated. You might try regex, but the real difficulty is coming up with a pattern that works for every variation of a query. AI could generate plausible rewrites, but it's hard to guarantee correctness. These rewrites may look valid, but SQL has many subtle corner cases, so it's difficult to prove that the...
Continue readingIntroducing pganalyze Query Advisor: Proactive Query Plan Optimization for Postgres
29 September, 2025Postgres teams know the pain of one slow query snowballing into latency or downtime. Some have experimented with GenAI analysis, but it is too expensive to run continuously in the background. Others rely on general monitoring tools, but those cannot detect Postgres-specific patterns in query plans. What’s missing is a purpose-built, cost-effective solution that understands Postgres deeply and gives you actionable guidance. At pganalyze we create purpose-built systems that address common...
Continue readingWaiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O
07 May, 2025With the Postgres 18 Beta 1 release this week, a multi-year effort and significant architectural shift in Postgres is taking shape: Asynchronous I/O (AIO). These capabilities are still under active development, but they represent a fundamental change in how Postgres handles I/O, offering the potential for significant performance gains, particularly in cloud environments where latency is often the bottleneck. While some features may still be adjusted or dropped during the beta period before the...
Continue readingPostgres vs. SQL Server: B-Tree Index Differences & the Benefit of Deduplication
03 April, 2025When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways. In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree indexes implementations behave under the hood and how they...
Continue readingHow Notion Runs PostgreSQL at Scale on Amazon RDS with pganalyze
27 February, 2025We recently sat down with some members of the Notion engineering team to discuss how they manage PostgreSQL at scale on Amazon RDS with pganalyze. As Notion has grown, ensuring database performance and reliability has become increasingly important. In this case study, we explore the challenges they faced, the solutions they implemented, and the impact of using pganalyze to optimize their PostgreSQL operations. You can also find the full case study in our resources section here. Notion blog...
Continue readingComparing EXPLAIN Plans is hard (and how pganalyze does it)
06 February, 2025The Postgres EXPLAIN command is invaluable when trying to understand query performance. SQL is a declarative language, and the Postgres query planner will decide the most efficient way to execute a query. However, plan selection is based on statistics, configuration settings, and heuristics—not a crystal ball. Sometimes there's a substantial gap between what the planner thinks is most efficient and reality. In those situations, EXPLAIN can help Postgres users understand the planner's...
Continue readingReplacing Oracle Hints: Best Practices with pg_hint_plan on PostgreSQL
05 February, 2025If you're migrating from Oracle Database to PostgreSQL, you're likely accustomed to using hints to optimize queries. In Oracle, these are special directives embedded in SQL (like /*+ INDEX(...) */) that steer the optimizer's execution plan. They can be extremely useful but also introduce complexity and “hint debt” over time. PostgreSQL takes a very different approach to query optimization. Rather than supporting built-in hints, the Postgres community, historically, has emphasized relying on its...
Continue readingIntroducing Query Tuning Workbooks: Safely Tune Postgres Queries on Production with pganalyze
08 January, 2025At some point, every engineering team finds itself grappling with the complexity of query optimization. One query may run perfectly well for a particular customer’s parameters, yet degrade performance for another’s dataset. A small tweak that improves latency in staging might have unforeseen consequences in production. Until now, the standard approach to experimentation—perhaps running EXPLAIN ANALYZE in a sandbox environment, copying and pasting results into an editor—hasn’t given developers...
Continue readingTracking Postgres Buffer Cache Statistics over time with pganalyze
05 December, 2024When Postgres accesses data for a query, the tables and indexes first pass through the buffer cache, which is a fixed-size in-memory cache, configured by the shared_buffers setting. Query performance can often rely on whether the query’s data is already in the cache or whether it has to access the underlying disk (or OS page cache) for a significant amount of data. Today, we’re announcing the new Buffer Cache Statistics feature in pganalyze, which relies on the pg_buffercache extension to show...
Continue readingIntroducing Postgres Plan Statistics in pganalyze for Amazon Aurora
21 November, 2024At pganalyze we've offered query performance monitoring of Postgres databases for many years now, helping companies at scale ensure their Postgres database is performant and queries are as fast as possible. One common story we hear when it comes to analyzing Postgres performance, and identifying the root cause of slowdowns is: Has my query plan changed? Recently Amazon Aurora, the highly scalable AWS PostgreSQL service, has made execution plan data more readily available by introducing...
Continue readingPostgres Planner Quirks: Incremental Sort, and when it goes wrong
01 October, 2024In today’s E120 of “5mins of Postgres” we're returning to our Postgres planner quirks series to talk about Incremental Sort, and when it goes wrong. Incremental Sort can often speed up query plans when you have an existing sort order; however, there can be edge cases where the planner chooses a sub-optimal plan. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript To be clear, Incremental sort...
Continue readingConstraint Programming in Action: Optimizing Postgres Index Selection
01 August, 2024In a previous article we introduced constraint programming (CP), a declarative paradigm to solve discrete optimization problems. We discussed some theory, and considered a realistic example by constructing a work schedule for a small store. In this follow-up article, we’ll now take our understanding of constraint programming, and apply it to a real-world example: Selecting optimal indexes in Postgres. The model presented in this article is similar to the CP model that was presented at PGDay...
Continue readingA practical introduction to Constraint Programming using CP-SAT and Python
02 July, 2024Imagine you're an e-commerce giant that would like to build a new warehouse to improve service to your customers, but you need to know what is the best location for it. Or you're a global shipping company that assigns packages to their delivery trucks and has to choose the best routes in order to save gas and reduce driver overtime. Or an airline that is looking to offer service to a new location, and needs to know which types of planes they should use and on what schedule, to maximize the...
Continue readingThe surprising logic of the Postgres work_mem setting, and how to tune it
16 June, 2024In today’s E119 of “5mins of Postgres” we discuss tuning the Postgres work_mem setting for your workload, and why it can be quite confusing to interpret the meaning of work_mem correctly for a given query. We also discuss the impact of hash_mem_multiplier, and why it's default changed from 1.0 to 2.0 in Postgres 15. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll start with this...
Continue readingWaiting for Postgres 17: Better Query Plans for Materialized CTE Scans
09 June, 2024In today’s E118 of “5mins of Postgres” we discuss two changes in the upcoming Postgres 17 release that improve query plans for queries that involve CTEs. This can improve query plans where you would see an explicit CTE scan, due to use of the MATERIALIZED keyword, or because Postgres wasn't able to pull up a query to the upper plan level. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript...
Continue readingPostgres Planner Quirks: JOIN Equivalence Classes and IN/ANY filters
02 June, 2024In today’s E117 of “5mins of Postgres” we continue our series on Postgres planner quirks. Today, we discuss JOIN column equivalence and when there are issues with IN/ANY filters not being considered as part of the equivalence class in Postgres. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll start with this blog post by Deepak Mahto on his personal blog. In this post, Deepak...
Continue readingWhat's (not) in Postgres 17 beta1, and how to test it
25 May, 2024In today’s E116 of “5mins of Postgres” we discuss the Postgres 17 beta1 release that came out earlier this week, some highlights of the release, some changes that unfortunately got reverted since the feature freeze, and how to help the community during the beta testing process. Share this episode: Click here to share this episode on LinkedIn. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript A summary of the Postgres release schedule A quick reminder on...
Continue readingPostgres Planner Quirks: How to fix bad JSONB selectivity estimates
18 May, 2024In today’s E115 of “5mins of Postgres” we discuss a common challenge when using equality and contains operators for querying JSONB columns, and why the Postgres planner will often get the estimates wrong for such queries, causing bad query plans. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll start with this blog post by Eric Fritz on the Render blog. The problem: A...
Continue readingPostgres CVE-2024-4317 and how to fix the system views
11 May, 2024In today’s E114 of “5mins of Postgres” we discuss the updated Postgres minor releases that were just released this week. And specifically, we're going to talk about the security issue that was fixed in this release. Now I'll tell you first off, that this security issue, CVE-2024-4317 is a fairly minor one, but the reason I want to talk about it is, because I was the one who reported it. I'll explain to you when this can be a problem and how you can apply the fix. If you want to fix this issue,...
Continue readingPostgres Planner Quirks: The impact of ORDER BY + LIMIT on index usage
04 May, 2024Today we're going to start a new series on 5mins of Postgres called "Postgres Planner Quirks". When we say "quirks", we mean odd behavior that might make sense to you if you're a Postgres hacker, but certainly is confusing when you're the end user, like a DBA, data platform, engineer, or application developer. In today’s E113 of “5mins of Postgres” we discuss a commonly encountered Postgres planner quirk, which is how Postgres behaves when you have a LIMIT and an ORDER BY clause, and it picks...
Continue readingWaiting for Postgres 17: Streaming I/O for sequential scans & ANALYZE
25 April, 2024In today’s E112 of “5mins of Postgres” we discuss streaming and vectored I/O in Postgres 17. This is an important step towards implementing asynchronous I/O in Postgres. In Postgres 17 we do not yet have asynchronous I/O, but we do see a performance benefit from what is essentially a refactoring work around the internal APIs. In this episode, I give you a little bit of a behind the scenes view on my understanding of the changes made in 17 and how they impact future asynchronous I/O work. Share...
Continue readingWaiting for Postgres 17: Faster B-Tree Index Scans for IN(...) lists and ANY =
21 April, 2024In today’s E111 of “5mins of Postgres” we discuss faster B-tree index scans in Postgres 17 for queries that involve IN lists or other cases where multiple array values are being passed to Postgres (ScalarArrayOpExpr). We show how even simple cases now avoid repeated page access, and how turning filters into index conditions and processing like an Index Skip Scan can yield significant speedups for certain queries. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter....
Continue readingWaiting for Postgres 17: Benchmarking with pg_buffercache_evict
12 April, 2024In today’s E110 of “5mins of Postgres” we discuss the pg_buffercache_evict function recently committed to Postgres 17, that lets you remove individual pages from the Postgres shared buffer cache. We show how this can be used for testing query performance on a test system, and the impact of double buffering. We also discuss how to flush the OS page cache for a given table to run a benchmark on a query and avoid confusing EXPLAIN (ANALYZE, BUFFERS) output. Share this episode: Click here to share...
Continue readingWaiting for Postgres 17: Faster VACUUM with Adaptive Radix Trees
06 April, 2024In today’s E109 of “5mins of Postgres” we discuss a recently committed change to the Postgres 17 development branch that introduces an improved dead tuple storage for autovacuum based on adaptive radix trees. This significantly reduces autovacuum memory consumption and avoids the need for multiple index vacuum phases. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript As...
Continue readingWaiting for Postgres 17: Improved EXPLAIN for SubPlan nodes
30 March, 2024In today’s E108 of “5mins of Postgres” we discuss a recently committed change to the Postgres 17 development branch that improves how EXPLAIN represents SubPlan and InitPlan nodes. We compare the output with 16, and discuss the background of how a subplan can result from a sub-SELECT, and when it results in a regular JOIN instead. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. ...
Continue readingWaiting for Postgres 17: The new built-in C.UTF-8 locale
22 March, 2024In today’s E107 of “5mins of Postgres” we discuss a recently committed change to the Postgres 17 development branch that adds a built-in collation provider to Postgres, as well as a new built-in C.UTF-8 locale that allows fast binary sorting, whilst supporting Unicode-aware operations on texts. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll start with this commit by...
Continue readingHow Figma built DBProxy for sharding Postgres
15 March, 2024In E106 of "5mins of Postgres" we discuss how Figma scaled out their Postgres installation by 100x over 4 years, and recently switched to horizontal sharding using their DBProxy query proxy. We also compare their approach to Notion's sharding setup, as well as the Citus extension for Postgres. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll start with this blog post...
Continue readingTuning random_page_cost and how index correlation affects query plans
08 March, 2024In today’s E105 of “5mins of Postgres” we discuss why changing random_page_cost from the default of 4 is usually a good idea, and a specific example of where a high random_page_cost caused a bad plan due to index correlation. We dive into the relevant parts of the Postgres source, and explain how planner costing works. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript We'll...
Continue readingWaiting for Postgres 17: Configurable SLRU cache sizes for increased performance
01 March, 2024In today’s E104 of “5mins of Postgres” we discuss a recently committed change to the Postgres 17 development branch that allows configuring the size of the Postgres SLRU caches, as well as improvements to LWLock contention on SLRU related locks. We review the background on what SLRU caches do in Postgres, and historic performance challenges experienced at scale. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and...
Continue readingIntroducing pganalyze Index Advisor 3.0 - A workload-aware system for finding missing indexes in Postgres
29 February, 2024Indexing your database across a diverse set of queries across an entire table can be a challenge. You need to determine what indexes might be missing, how new indexes will interact with existing ones, and the overall impact on the specific workload. Continuously adapting indexes is difficult and time consuming; you might deal with a large number of tables, queries, and existing indexes, each with its own characteristics. For example, balancing the need for indexes against the potential...
Continue readingSpeeding up partial COUNT(*) in Postgres by using LIMIT in a subquery
22 February, 2024In today’s E103 of “5mins of Postgres” we look at optimizing the performance of COUNT statements when only a subset of the data needs to be counted, through use of a LIMIT in a sub-SELECT, or what I like to call a "limited COUNT". We also discuss how this can be represented effectively in a web application. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript In this blog post...
Continue readingOut of range planner statistics and "get_actual_variable_range" in Postgres
15 February, 2024In today’s E102 of “5mins of Postgres” we talk about how Postgres handles situations where planner statistics are out of date. We look at the logic for deciding when the planner goes to the actual indexes on a table to get the maximum value using the get_actual_variable_range function, and performance implications of this approach. Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. ...
Continue readingFaster query plans with Postgres 16: Incremental Sorts, Anti-JOINs and more
08 February, 2024In today’s E101 of “5mins of Postgres” we talk about the planner improvements in Postgres 16, including better use of Incremental Sort, more efficient anti-JOINs, Hash Join improvements, more efficient window functions, and more. And all of that without having to change your queries! Share this episode: Click here to share this episode on LinkedIn or on X/Twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript In this blog post by David Rowley on the...
Continue readingReducing table size with optimal column ordering and array-based storage
01 February, 2024In today’s E100 of “5mins of Postgres” we discuss how to optimize the row size in Postgres to reduce the overall table size. Specifically, we take a look at how column ordering affects storage size, how you can play column tetris to optimize it, and the big benefits that array-based storage can yield for some workloads. 🎉 This is episode 100. And over the last, almost two years, it's been great every week to look at what's new with Postgres or what are good best practices that are still...
Continue readingFinding the root cause of locking problems in Postgres
25 January, 2024In today’s E99 of “5mins of Postgres” we're showing how to go about finding the source of a locking problem in Postgres. Specifically, we are looking at how to debug heavyweight locks, how to end a process that’s holding a lock in Postgres, and general helpful settings for handling Postgres locking. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Debugging a lock problem in...
Continue readingHow to partition Postgres tables by timestamp based UUIDs
18 January, 2024In today’s E98 of “5mins of Postgres” we're going to talk about partitioning Postgres tables by timestamp based UUIDs. We're also going to talk about the status of UUIDv7 in the current Postgres development branch. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Partitioning Postgres tables by ULID To start with, this is a forum post by Chris O'Brien on the Elixir forum....
Continue readingThe different trade-offs of Distributed Postgres architectures
12 January, 2024Happy new year! We hope you are doing well and are excited to be back with our “5mins of Postgres” video series. In E97 we're going to talk about distributed Postgres architectures. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Understanding distributed Postgres This is a blog post by Marco Slot on the Crunchy Data blog. For context, Marco is a principal author of the...
Continue readingIntroducing pg_query for Postgres 16 - Parsing SQL/JSON, Windows support, PL/pgSQL parse mode & more
11 January, 2024Parsing SQL queries and turning them into a syntax tree is not a simple task. Especially when you want to support special syntax that is specific to a particular database engine, like Postgres. And when you’re working with queries day in day out, like we do at pganalyze, understanding the actual intent of a query, which tables it scans, which columns it filters on, and such, is essential. Almost 10 years ago, we determined that in order to create the best product for monitoring and optimizing...
Continue readingWaiting for Postgres 17: Incremental base backups
21 December, 2023In E96 of “5mins of Postgres”, we talk about the incremental backup feature that was just committed to the Postgres 17 development branch. To be clear, this is very much off the presses and this may yet change or be removed completely. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Introducing incremental backups in Postgres 17 I was very excited earlier this week when I...
Continue readingZero downtime Postgres upgrades and how to logically replicate very large tables
14 December, 2023In E95 of “5mins of Postgres”, we talk about zero downtime Postgres upgrades and how to logically replicate very large tables. It’s an exciting episode packed with lots of real world experiences and examples from GitLab and Instacart! Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Using logical replication to update Postgres with zero downtime First of all, we'll start with...
Continue readingThe un-fun work of making Postgres FIPS compliant
07 December, 2023In today’s episode 94 of “5mins of Postgres”, we’ll walk through Postgres and FIPS. We’ll explain what the FIPS mode is, where it is required, how the MD5 algorithm presents issues, and more. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Postgres and FIPS mode This is a blog post by Peter Eisentraut. In it, he talks about the work that he's done in Postgres upstream to...
Continue readingExploring Postgres VACUUM with the VACUUM Simulator
04 December, 2023For many of us, how VACUUM works in Postgres and when autovacuum triggers it is not easy to understand. As you start digging into it, you’ll discover more and more questions. And when you have many tables, it’s hard to keep track of it all. Luckily, Postgres provides a lot of information that we can use to better understand its internals. For example, we can look at a table's n_dead_tup value to understand the current number of dead tuples, and relate this to the autovacuum scale factor and...
Continue readingAvoiding deadlocks in Postgres migrations
30 November, 2023In today’s episode 93 of “5mins of Postgres”, we're talking about how to avoid deadlocks in Postgres migrations and surprising lock behavior with migrations that I personally encountered. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Deadlocks and other surprising behavior in Postgres migrations To start with, this is a blog post by Vanessa Ung on the Dovetail engineering...
Continue readingHow to use CREATE STATISTICS to improve a plan rows estimate, and why it matters
23 November, 2023In E92 of “5mins of Postgres”, we're going to talk about creating functional dependency statistics to improve plan cost estimates in Postgres. We’re looking at a problem that I myself encountered last week when I was trying to optimize the estimates we're getting for a particular query inside our own system. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Which type of...
Continue readingGitLab's challenge with Postgres LWLock lock_manager contention
16 November, 2023In E91 of “5mins of Postgres”, we're going to talk about lock_manager LWLock contention in Postgres and how the team at GitLab has addressed this production problem. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript Debugging a problem with LWLock lock_manager contention We'll start with this issue by Matt Smiley from the GitLab team. I want to first mention that it's great...
Continue readingThe pg_stat_checkpointer view in Postgres 17
09 November, 2023Today, we're going to talk about the new pg_stat_checkpointer in Postgres 17, and how that changed the pg_stat_bgwriter, or background writer, statistics. We are also looking at the impact these changes have on shared buffers, the Postgres WAL, and pg_stat_io. Share this episode: Click here to share this episode on LinkedIn or on twitter. Feel free to sign up for our newsletter and subscribe to our YouTube channel. Transcript pg_stat_checkpointer and its impact on pg_stat_bgwriter To start...
Continue readingPerformance implications of medium size values and TOAST in Postgres and how to mitigate them
02 November, 2023Today, we're going to talk about the surprising impact of medium sized texts on Postgres performance. We take a look at a great blog post by Haki Benita. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Understanding medium size values and TOAST performance in Postgres This article by Haki Benita from back in 2020 is a really good post that's worth revisiting. I was discussing recently with a...
Continue readingPgBouncer 1.21 adds prepared statement support in transaction mode
26 October, 2023In this episode, we're going through the new PgBouncer 1.21 release which adds prepared statement support for PgBouncer when running in transaction mode. We talked about this previously in episode 73 of 5mins of Postgres a couple months ago and now this is actually released in a production version of PgBouncer. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript What is transaction mode in PgBouncer?...
Continue readingWhat the new Rails 7.1 brings for Postgres users
19 October, 2023Today, we're going to talk about the new Ruby on Rails 7.1 release and what it brings for Postgres users. This is 5mins of Postgres episode 87. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Rails 7.1 and Postgres Usually, we don't speak about language specific or framework specific changes on 5mins of Postgres because you might be using Go, you might be using Java, you might be using Python, or...
Continue readingHOT Updates and BRIN indexes in Postgres 16
12 October, 2023Today, I want to talk about a new change in Postgres 16 that helps you ignore BRIN indexes when Postgres checks for HOT updates. Join me for 5mins of Postgres E86. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Postgres Heap-Only-Tuples So before we dive into this change in Postgres 16, let's do a quick recap of what HOT updates are. HOT stands for Heap-Only Tuples. That means they're versions of...
Continue readingPostgres I/O Basics, and how to efficiently pack table pages
05 October, 2023In E85 of “5mins of Postgres” we’ll talk about understanding Postgres IOPS and how Postgres structures its physical table structure. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! What is IOPS in PostgreSQL? We'll start with this blog post by John Dalton on the Crunchy Data blog. In his blog post, John explains why looking at I/O performance matters even when your data fits into...
Continue reading4 improvements in Postgres 16 that you might not know about yet
28 September, 2023In E84 of “5mins of Postgres” we’ll talk about four new features in Postgres 16 that you might not know about yet. You’ll learn about pg_input_is_valid, pg_input_error_info, improvements to \watch, number separators in the SQL:2023 standard, and more! Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's dive in! pg_input_is_valid and pg_input_error_info Let’s start with two blog posts on Luca...
Continue readingEXPLAIN (ANALYZE, BUFFERS) and interpreting shared buffers hit in Nested Loops
21 September, 2023In E83 of “5mins of Postgres” we're talking about using EXPLAIN (ANALYZE, BUFFERS) to optimize Postgres queries, and what to know when interpreting the shared hits counter for Nested Loops. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! The basics of EXPLAIN (ANALYZE, BUFFERS) We'll start with a blog post by Will Jessop on his personal blog. If you've been using EXPLAIN (ANALYZE,...
Continue readingPostgres 16 is released!
14 September, 2023In E82 of “5mins of Postgres” we're talking about Postgres 16, which was released today! There are a lot of exciting features in Postgres 16. I want to give you a quick rundown of a few things that I find exciting, as well as a few things that you might have not known about yet that people in the community pointed out in the last couple of weeks. We had talked about some of them previously in 5mins of Postgres episodes. We will link them at the end of this article! Share this episode: Click...
Continue readingHow partition-wise joins and aggregates improve query performance
17 August, 2023In E81 of “5mins of Postgres” we're talking about how partition-wise joins can improve your Postgres query performance. We also look at partitionwise-aggregates and their impact on query planning time and query execution time. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! Benchmarking partitioning options in Postgres This is a blog post by Ashutosh from the EDB team where he ran a...
Continue readingOptimizing bulk loads in Postgres, and how COPY helps with cache performance
10 August, 2023In episode 80 of “5mins of Postgres” we're going to talk about optimizing bulk load performance in Postgres and how the COPY ring buffer works. We're comparing different methods of INSERTs and show why COPY is the fastest option. We are also looking at pgbench and pgbuffercache to show client side vs server side performance. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's take a look! In this...
Continue readingHandling outliers in BRIN indexes with the new multi minmax operator class
03 August, 2023In E79 of “5mins of Postgres” we're going to talk about when good correlation is not enough, the cases where BRIN indexes can be worse than having no index at all. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! When using an index can be slower than not using one This is a blog post by Haki Benita where he describes a particular situation where a BRIN index performed really badly....
Continue readingPartitioning in Postgres and the risk of high partition counts
27 July, 2023In E78 of “5mins of Postgres”, we're going to talk about partitioning in Postgres and how it can either kill performance or make it a lot better. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! Decreased Postgres performance by using partitioning We'll start with this blog post by Hans-Jürgen Schönig on the Cybertech blog. In this article, Hans-Jürgen shows how you can get...
Continue readingIntroducing pganalyze VACUUM Advisor: Workload-aware autovacuum tuning for Postgres
25 July, 2023VACUUM in Postgres is a fact of life - every Postgres installation out there today, including those on AWS Aurora and Google AlloyDB, has to run autovacuum on each table at least every 2 billion transactions to perform freezing. And if you'd like to keep the active portion of your tables in memory, a frequent vacuuming cycle ensures tables stay small, and queries remain fast, with minimal bloat. Oftentimes, when your database is still small, you don't have to think about VACUUM and autovacuum....
Continue readingHow Table Access Methods in Postgres may offer a future without Bloat
20 July, 2023In episode 77 of “5mins of Postgres”, we are going to take closer look at VACUUM and a potential future without bloat in your database thanks to table access methods. In detail, we are looking at Alexander Korotkov's OrioleDB. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's get to it! VACUUM and table bloat in Postgres This is a blog post by Alexander Korotkov on the OrioleDB blog. It is an...
Continue readingHow to optimize correlated subqueries in Postgres
13 July, 2023In episode 76 of “5mins of Postgres”, we are going to talk about how to optimize subqueries in Postgres by understanding the Postgres planner better. We look at correlated vs. uncorrelated subqueries, as well as scalar subqueries vs. tabular subqueries. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's dive in! How Postgres subqueries are used in a query This is a blog post by Laurenz Albe on...
Continue readingPostgres performance with IN vs ANY
06 July, 2023In E75 of “5mins of Postgres” we're going to talk about a surprising case Kaarel Moppel ran into, where using "IN" is faster than "ANY". We investigate this edge case in detail and are also looking at Tom Lane's response, who is one of the main authors of a lot of Postgres planner logic, to Kaarel's case. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! A surprising performance...
Continue readingStoring and querying vector data in Postgres with pgvector
29 June, 2023In today’s episode 74 of “5mins of Postgres” we're going to talk about vectors in Postgres. We are having a look at the claim that vectors are the new JSON in Postgres, are talk about Andrew Kane’s pgvector, and what work is needed in the Postgres core to support vectors and ML and AI work in Postgres better. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump right in! How Postgres has...
Continue readingSpeed up Postgres with transaction pooling with prepared statements
22 June, 2023In E73 of “5mins of Postgres” we're talking about how to use transaction pooling with prepared statements to make Postgres up to 30% faster. We're looking at improvements to pgbouncer, PgCat, and other connection poolers. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! Enabling Prepared Statements with Transaction Pooling for 30% Faster Queries We'll start with this blog post by...
Continue readingHow to use HOT Updates in Postgres to make CLUSTER work better
15 June, 2023In E72 of “5mins of Postgres” we're gonna talk about how to use HOT updates to use CLUSTER more effectively, and how we can utilize CLUSTER to get the last bit of performance out of an index scan. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump right in! Utilizing CLUSTER to speed up index scans in Postgres This is a blog post by Laurenz on the Cybertec blog. In this blog post, Laurenz...
Continue readingHow Figma and Notion scaled Postgres
08 June, 2023In episode 71 of “5mins of Postgres” we're going to talk about the growing pains of database architecture at Figma as well as alternate architectures, such as the sharding architecture chosen by Notion. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's dive in! How Figma scaled Postgres by partitioning tables between servers This is a blog post by Tim from the Figma engineering team from earlier...
Continue readingHow bulk loading with COPY can be 300% times faster in Postgres 16 Beta 1
01 June, 2023In episode 70 of “5mins of Postgres” I want to talk about Postgres 16 Beta 1 being released and how one of its improvements can allow you to bulk load about 300% times faster with COPY. The reason for this being an improvement to relation extension locks. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! Getting started with Postgres 16 Beta 1 Last week, the first beta release of...
Continue readingTuning huge pages in Postgres
25 May, 2023In E69 of “5mins of Postgres” we're going to talk about tuning huge pages in Postgres. We’ll explain what the Translation Lookaside Buffer is and how, starting with Postgres 15, Postgres can help you calculate how many huge pages to allocate. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! What are huge pages in Postgres? This blog post by Bernd on the Cybertec blog describes how...
Continue readingA Postmortem on High Memory Usage with Prepared Statements
18 May, 2023In today’s episode 68 of “5mins of Postgres” we're going to talk about how using partitioning together with prepared statements can lead to out of memory problems. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! How partitioning together with prepared statements can cause high memory usage in Postgres This is a blog post by Christopher on the Prefect blog, where he describes a...
Continue readingForcing Join Order in Postgres Using Optimization Barriers
11 May, 2023In today’s episode 67 of “5mins of Postgres” I want to talk about forcing join order in Postgres. We’ll walk through 3 techniques: Adding OFFSET “0” to a sub-SELECT, using common table expressions, and working with the join_collapse_limit. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! How to force the join order in Postgres with an optimizer barrier This is a blog post by...
Continue readingPostgres 16: Logical decoding on standbys
04 May, 2023In E66 of “5mins of Postgres” we're talking about logical decoding on a standby in Postgres 16, which allows us to keep following the logical replication stream whilst a failover happens. We’ll also touch upon the new pg_log_standby_snapshot function. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump right in! Using logical decoding on standbys in Postgres 16 This is a blog post by Bertrand...
Continue readingPostgres 16: Running EXPLAIN on any query (even with $1 parameters!)
27 April, 2023In episode 65 of “5mins of Postgres” we're going to talk about explaining generic plans in Postgres 16. We're looking at generic plans vs. custom plans, the benefits of generic plans and the limitations of EXPLAIN (GENERIC_PLAN). Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! EXPLAIN (GENERIC_PLAN) in Postgres 16 This is a blog post by Laurenz Albe on the Cybertec blog, where he...
Continue readingPostgres 16: Buffer cache hit ratio and I/O times in pg_stat_io
20 April, 2023This is E64 of “5mins of Postgres”. Today, we talk about two improvements to the new pg_stat_io view in Postgres 16: tracking shared buffer hits and tracking I/O time. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! Tracking shared buffer hits and I/O times in pg_stat_io We previously spoke about pg_stat_io in a prior 5mins of Postgres episode, as well as this blog post. Just for...
Continue readingVacuum Cost Limit and Parallel Aggregate improvements in Postgres 16
13 April, 2023In episode 63 of “5mins of Postgres” we look at two new exciting features in Postgres 16: Updating the cost limit on the fly and using aggregate functions in parallel. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! Postgres 16: Enhancements for vacuum and parallel aggregate As of last Saturday (April 8th, 2023), the feature freeze for Postgres 16 has begun. This means that now no...
Continue readingPL/Rust 1.0 and its trusted language mode
06 April, 2023In E62 of “5mins of Postgres” we talk about PL/Rust, its 1.0 release and how it implements trusted language support in Postgres. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! PL/Rust: Running custom Rust code in Postgres PL/Rust, for those of you who are not familiar, is a way to use Rust to create your user defined functions in Postgres. Similarly to how you could use PL/pgSQL or...
Continue readingCollations in Postgres
30 March, 2023In episode 61 of “5mins of Postgres” we’re talking about collations in Postgres. We’ll specifically take a look at glibc and ICU collations, and problems that can arise when using different operating system releases. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! Working with collations In this blog post by Jeremy from the AWS team, he talks about a hypothetical situation where...
Continue readingPostgres connection pooling: Comparing PgCat and pgbouncer
23 March, 2023In episode 60 of “5mins of Postgres” we’re talking about PgCat, a new connection pooler for Postgres, and how it compares to pgbouncer, which a lot of folks currently use. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's get to it! PgCat - A new Postgres connection pooler First, I encourage you to read through the original Hacker News post by Lev from the PostgresML team where he announced...
Continue readingUUIDs vs Serial for Primary Keys - what's the right choice?
16 March, 2023In E59 of “5mins of Postgres” we’re talking about UUIDs vs Serials for primary keys. In our last episode, we talked about what happens if you use a regular four byte integer for your primary key and you then run out of space. Today, I want to talk about the scenario where you have made the right choice to use bigints in your system, but you're trying to decide whether you want to use UUIDs going forward. Or the other way around, if you are currently using UUIDs, and you're unsure if that's the...
Continue readingHandling integer sequence overflow without downtime in Postgres
09 March, 2023In episode 58 of 5mins of Postgres we're going to talk about handling integer sequence overflow in Postgres without downtime. We show how you can avoid integer overflow and how you can fix it once you've ran into it. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump in! What happens when a sequence reaches max value in Postgres? In this blog post, Jesse Soyland from the Crunchy Data team...
Continue readingHow Postgres DBAs can use pg_stat_io
01 March, 2023In today’s episode 57 of 5mins of Postgres I want to talk about the pg_stat_io view and the everyday DBA perspective on why it matters. We’ll go through tracking activity in Postgres with pg_stat_io and talk about getting insights into extends, evictions, reuses, and more. The pg_stat_io view is a new feature in the upcoming Postgres 16 release that was committed about three weeks ago. This was committed by Andres Freund, and was authored by Melanie Plageman. Share this episode: Click here to...
Continue readingUnlogged tables in Postgres: Is their data truly gone after a crash?
23 February, 2023In today’s E56 of 5mins of Postgres we’re talking about unlogged tables, reasons to use them for increased write performance, reduced vacuum impact and less total WAL, and reasons to not use them, because data is lost when Postgres crashes. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's dive in! What is an unlogged table in Postgres? First of all, we'll start with this blog post by Greg...
Continue readingAvoid Postgres performance cliffs with MultiXact IDs and foreign keys
16 February, 2023In episode 55 of 5mins of Postgres we talk about performance cliffs encountered with MultiXact transaction IDs. We’ll also look at foreign key relationships and the Postgres VACUUM and autovacuum process. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! Avoiding performance cliffs with MultiXact IDs and foreign keys We'll start with this blog post by Christophe Pettus who writes...
Continue readingPostgres 16: Cumulative I/O statistics with pg_stat_io
14 February, 2023One of the most common questions I get from people running Postgres databases at scale is: How do I optimize the I/O operations of my database? Historically, getting a complete picture of all the I/O produced by a Postgres server has been challenging. To start with, Postgres splits its I/O activity into writing the WAL stream, and reads/writes to the data directory. The real challenge is understanding second-order effects around writes: Typically the write to the data directory happens after...
Continue readingNever run VACUUM FULL: How to run pg_repack on Amazon RDS and Aurora
09 February, 2023In today’s E54 of 5mins of Postgres we talk about why you shouldn't use VACUUM FULL and how to use the alternative, pg_repack, on Amazon RDS and Aurora. We walk through pg_repack’s benefits, how to install it, and how it helped making one of our tables over 10x smaller. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's take a look! Why you should use pg_repack over VACUUM FULL in Postgres In...
Continue readingTracing Locks in Postgres using eBPF programs
02 February, 2023In episode 53 of 5mins of Postgres, we talk about tracing Postgres locks using eBPF programs. Specifically, we’re looking at pg_lock_tracer and pw_lw_lock_tracer by Jan Nidzwetzki. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's have a look! pg_lock_tracer: An overview In this blog post, Jan from the TimescaleDB team describes two projects that he started over the holiday break. The programs...
Continue readingPostgres 16: Surviving without a superuser & reserved_connections
26 January, 2023In episode 52 of 5mins of Postgres, we talk about the createrole_self_grant option in Postgres 16, which lets us create a role that can administer the database but not break out of it. We also talk about the pg_use_reserved_connections role. All in all, Postgres 16 and above will make it easier to manage roles consistently and make sure that you grant the right permissions to the right users. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and...
Continue readingUsing Memoize to speed up joins between two Postgres tables
19 January, 2023This is episode 51 of 5mins of Postgres, and today we're going to talk about how to use Memoize to speed up joins between two tables. We’re looking at presentations from Bruce Momjian and Lukas Eder and quickly explain a bug with Memoize in Postgres versions earlier than 14.4. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's get started! What is the Memoize plan node in Postgres? This episode...
Continue readingUsing pgbench to load data faster, and the random_normal function in PG16
12 January, 2023Today, in episode 50 of 5mins of Postgres, we're going to talk about two things. We're going to talk about Kaarel Moppel's post on how to generate lots of test data with Postgres, using pgbench to load data fast and faster, and we're going to talk about the new "random_normal" function that was recently added to Postgres 16. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's start this off! Why...
Continue readingSpeed up Postgres queries with UNIONs and subquery pull-up
22 December, 2022Today, we're gonna talk about Postgres UNION. We'll look into the differences of Postgres UNION vs. OR and are investigating if using UNION is always the better choice. Quick answer: It is not. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's jump right in! Postgres UNION Performance Today, we're gonna talk about Postgres UNION. This is a blog post that Laurenz Albe written earlier this week...
Continue readingBenchmarking multi-column, covering and hash indexes in Postgres
15 December, 2022Today, we talk about benchmarking multi-column vs. multiple combined indexes in Postgres, and whether B-Tree or hash indexes are a better fit. We also look into cases and workloads where different indexes are better than other ones. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel. Transcript Let's start benchmarking! Benchmarking indexes in Postgres In this blog post by Kaarel Moppel he's referencing another...
Continue readingRoaring Bitmaps and pgfaceting: Fast counting across large datasets in Postgres
08 December, 2022In E47 of 5mins of Postgres, we talk about counting and faceting large result sets using Roaring Bitmaps in Postgres. We are looking at the pg_roaringbitmap extension and the pgfaceting extension which help us improve our query performance from 222 seconds to 155 milliseconds. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump right in! Faster Postgres...
Continue readingLock monitoring in Postgres: Find blocking queries in a lock tree with pganalyze
01 December, 2022Postgres databases power many mission critical applications, and applications expect consistent query performance. If even a single query takes longer than expected, it can lead to unhappy users, or delayed background processes. We can use EXPLAIN to debug a slow query, but there is one Postgres problem it won't tell us about: Blocked queries. You may also know this as "blocked sessions" from other database systems. This is when one query holds a lock on a table and the other is waiting for...
Continue readingCreating custom extensions on Amazon RDS and Aurora with pg_tle
01 December, 2022In today's episode 46 of 5mins of Postgres, we talk about trusted language extensions and how to use them on Amazon RDS and Aurora, as well as the new pg_tle extension. We also take a look at pg_tle's support for the "password check hook". Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look! Trusted language extensions for Postgres This is an...
Continue readingPostgreSQL ANY vs IN lists: Bind parameters and performance
24 November, 2022Today, we look at the difference between PostgreSQL's ANY and IN operators, how they differ in performance, and why you might use one or the other. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in! PostgreSQL security: ANY operator vs. IN lists We will start with this blog post by Matt Hudson on the Crunchy Data blog. Matt gives a good introduction...
Continue reading5mins of Postgres E44: Reducing replication lag with maintenance_io_concurrency in Postgres 15
17 November, 2022Today, we're talking about reducing replication lag by setting the maintenance_io_concurrency setting in Postgres 15. We're also talking about true asynchronous I/O in Postgres and how to use "recovery prefetch". Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look! Using maintenance_io_concurrency to reduce replication lag In this article by Thomas...
Continue reading5mins of Postgres E43: Logical replication in Postgres 15 & why the REPLICA IDENTITY matters
10 November, 2022Today, we're going to talk about logical replication improvements in Postgres 15. In particular, we’ll talk about improvements to data filtering, why the REPLICA IDENTITY is important, and how logical replication deals with errors. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in! Postgres 15 logical replication improvements In this blog post, Amit...
Continue reading5mins of Postgres E42: A surprising case of very large integers causing a Sequential Scan
03 November, 2022Today, we talk about a surprising case of a very large integer causing a Sequential Scan in Postgres. There are cases when Postgres decides to not use an index, but instead opts for a Sequential Scan. This can cause some very real performance problems. Let’s have a look at why it does this and ways to resolve this issue! Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. ...
Continue reading5mins of Postgres E41: Tuning shared_buffers for OLTP and data warehouse workloads & is 25% of RAM a good choice?
27 October, 2022In this episode of 5mins of Postgres, we’re going to talk about tuning shared_buffers, benchmarking with pgbench, TPROC-C and TPROC-H, and whether using 25% of RAM is a good choice. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump right in! Tuning shared_buffers in Postgres I want to start with this tweet by Andres Freund from a couple of years ago. In...
Continue reading5mins of Postgres E40: How to debug slow Postgres UPDATEs and bloated tables with auto_explain & page_inspect
20 October, 2022Today, we talk about why Postgres UPDATEs are getting slower over time, HOT updates, auto_explain, the pageinspect extension, and reasons why VACUUM won’t remove dead rows from a table. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's get to it! Debugging slow Postgres UPDATEs and bloated tables This blog post by Laurenz from the Cybertec team talks about a...
Continue reading5mins of Postgres E39: Postgres 15 release, and 6 tips for managing indexes
13 October, 2022Today, we're going to talk about the Postgres 15 release, and six findings to improve how you manage your Postgres indexes. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in! Postgres 15 is released So, first of all, in the words of Postgres committer Tom Lane: "Drat, forgot this …". Postgres 15 has been tagged and is released today, go take a look...
Continue reading5mins of Postgres E38: When to use BRIN indexes in Postgres, tuning pages_per_range and determining correlation with pg_stats
06 October, 2022In this article and video, we talk about tuning a Postgres BRIN index and how to decide whether to use BRIN based on pg_stats statistics and the correlation column. We also talk about the impact block ranges can have and a visual model on how to think about BRIN indexes. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's get into it! How to tune BRIN indexes...
Continue reading5mins of Postgres E37: New in Postgres 15: Logical replication column filters & database-wide ICU collations
29 September, 2022Today, we talk about new features in Postgres 15, including logical replication improvements and database wide ICU locales. In case you're not aware, Postgres 15 rc1 is around the corner and we can expect the final release of Postgres 15 within the next couple of weeks. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in! Using column lists in logical...
Continue reading5mins of Postgres E36: Optimizing Postgres query throughput in Python using pipeline mode and psycopg 3.1
22 September, 2022Today, we're going to talk about the new psycopg driver for Python. We're also going to talk about pipeline mode in psycopg and how it works. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in! psycopg 3.1 for Python and Postgres First of all, I want to give a quick shout-out to Daniele Varrazzo who has been maintaining pyscopg and has started the new...
Continue reading5mins of Postgres E35: How to run EXPLAIN on normalized queries from pg_stat_statements using plan_cache_mode
15 September, 2022In today’s episode we're going to talk about how to run EXPLAIN on a normalized query from pg_stat_statements and how to get the generic query plan. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look! Postgres EXPLAIN plans and pg_stat_statements This blog post by Franck from Yugabyte talks about how you can get an EXPLAIN plan from a query that...
Continue reading5mins of Postgres E34: Making the most of Parallel VACUUM and why SQL/JSON was pushed out to Postgres 16
08 September, 2022In this episode, we're walking through parallel VACUUMing in Postgres 13 and newer. We're also going to talk about the unfortunate fact that SQL/JSON won't be in Postgres 15. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive right in! VACUUM and Parallel VACUUM in Postgres Starting with parallel VACUUM, this blog post by the database team from AWS goes...
Continue reading5mins of Postgres E33: Postgres on Kubernetes, choosing the right operator, and handling major version upgrades
01 September, 2022Today, we're gonna talk about Postgres on Kubernetes, how it has improved over the years and two improvements to Postgres operators in Kubernetes in recent times. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look! Postgres and Kubernetes In this blog post by Stefanie Stölting, she describes how Postgres on Kubernetes has improved and evolved over...
Continue reading5mins of Postgres E32: Benchmarking and improving query network latency using tc, COPY and pipeline mode
25 August, 2022Today, we're gonna talk about network latency, how it impacts your query performance and how to improve it by using pipeline mode or COPY in Postgres. We also take a look at traffic control with the Linux tc tool. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in! Improving network latency in Postgres In this blog post by Lawrence Jones, he describes...
Continue reading5mins of Postgres E31: Postgres security patch release, spotting vulnerable extensions and securing the public schema
18 August, 2022Today, we walk through the latest Postgres patch releases and a security bug that got fixed. We'll also talk about extension security in Postgres, as well as the public schema and a change in Postgres 15 to make it more secure. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. Latest PostgreSQL patches We'll start with this announcement for the new...
Continue reading5mins of Postgres E30: Postgres performance puzzle, Linux readahead and tuning effective_io_concurrency
11 August, 2022Today, we're going to talk about an interesting Postgres performance puzzle by Jeremy Schneider, Linux readahead and how to tune effective_io_concurrency. The default Linux readahead setting assumes sequential scans ahead, but you might want to change it, together with effective_io_concurrency, which tunes how much Postgres is telling the Linux Kernel to look ahead. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter...
Continue reading5mins of Postgres E29: Postgres lock conflicts on Amazon Aurora & tuning max_standby_streaming_delay and hot_standby_feedback
04 August, 2022Today, we're talking about Postgres lock conflicts on Amazon Aurora and tuning max_standby_streaming_delay and hot_standby_feedback, why you should implement retry logic with Aurora and how you can avoid query conflicts on Postgres with Amazon Aurora's fast clones. We’ll also look at an effective setup from the Twitch team for these settings. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our...
Continue reading5mins of Postgres E28: Row Level Security in Postgres, security invoker views and why LEAKPROOF functions matter
28 July, 2022Today, we're going to talk about Postgres Row Level Security, the BYPASSRLS attribute, security invoker views, and Postgres performance problems when using LEAKPROOF functions and Row Level Security policies. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. Using Postgres Row Level Security for multi-tenant databases In this blog post by Miki from...
Continue reading5mins of Postgres E27: Postgres 16 - Make subquery alias optional in FROM clause
21 July, 2022Today, we are talking about a change in Postgres 16, that removes the alias requirement for subqueries in FROM. This may seem like a very obscure problem, but it's actually a very common issue for people that either migrate to Postgres, for example from Oracle to Postgres, or who are new to Postgres and run into this error. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. ...
Continue reading5mins of Postgres E26: New in Postgres 15 - NULL handling in UNIQUE constraints & Parallel DISTINCT
14 July, 2022Today, we are talking about two new features in Postgres 15. First of all, we're going to talk about the new unique constraint NULLS NOT DISTINCT option, and then we'll talk about the SELECT DISTINCT performance improvements. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. UNIQUE constraints and NULL values First of all, in this post by Ryan...
Continue reading5mins of Postgres E25: Postgres lock monitoring, LWLocks and the log_lock_waits setting
07 July, 2022Today, we are talking about Postgres locking. When it is concerning, what lightweight locks are, as well as the log_lock_waits setting in Postgres. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. Postgres lock monitoring with the pg_locks view We'll start with this blog post by David (@pg_dwc) from the Crunchy Data team. If you want to monitor...
Continue reading5mins of Postgres E24: Tuning the Postgres statistics target & understanding selectivity for spatial indexes
30 June, 2022Today, we are talking about tuning the Postgres statistics target, why it is 100 and how that number was derived, and Postgres selectivity, specifically selectivity for spatial indexes. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. Tuning the Postgres statistics target In this blog post by Paul Ramsey from the Crunchy Data team, he talks about...
Continue reading5mins of Postgres E23: Fuzzy text search & case-insensitive ICU collations in Postgres
23 June, 2022Today, we're going to talk about fuzzy text search in Postgres with LIKE/ILIKE, trigrams, levenshtein distances, as well as case-insensitive pattern matching. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look. Fuzzy text search in Postgres First of all, in this blog post by Brendan Scullion, we're going to talk about how you can go from not so...
Continue reading5mins of Postgres E22: Reducing AWS Aurora I/O costs with table partitioning & understanding partition pruning
16 June, 2022Today, we're going to talk about using partitioning to reduce AWS Aurora I/O costs, pg_partman, and how partition pruning speeds up query performance. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in. Using Postgres partition pruning to reduce AWS Aurora I/O costs In this blog post by two engineers from the AWS team (Sami Imseih and Yahav Biran),...
Continue reading5mins of Postgres E21: Server-side backup compression with LZ4 and Zstandard in Postgres 15, and the removal of exclusive backup mode
09 June, 2022In today’s episode 21, we're going to talk about backup improvements in Postgres 15. Postgres 15 beta1 came out a couple of weeks ago and we’re looking into LZ4 and Zstandard compression, as well as the removal of the exclusive backup mode. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look. Removal of the exclusive backup mode in Postgres 15...
Continue reading5mins of Postgres E20: An important bug in Postgres 14 with REINDEX / CREATE INDEX CONCURRENTLY, and using the amcheck extension
02 June, 2022Today, we're going to talk about an important indexing bug that was discovered in Postgres 14 and how you can verify if this affects you with the amcheck extension. Share this episode: Click here to share this episode on twitter and subscribe to our YouTube channel. Transcript Let's dive in. REINDEX / CREATE INDEX CONCURRENTLY bug in Postgres 14 We'll start with this tweet by Michael Paquier that talks about REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY. If you are using Postgres 14,...
Continue readingAn automatic indexing system for Postgres: How we built the pganalyze Indexing Engine
01 June, 2022Indexing is hard. Most people who have worked with databases will agree that regardless of how you do it, creating the right indexes on your database requires thought and detailed analysis. It's not something to be taken lightly. If you don't create the right indexes, your queries will be slow, and if you create too many indexes, your writes on busy tables will be slowed down significantly. But even if you know how to do it, indexing takes a lot of effort. And this has gotten more complex with...
Continue readingA balanced approach to automatic Postgres indexing: The new version of the pganalyze Index Advisor
01 June, 2022Historically, index creation, tuning and maintenance has been the task of database administrators who had a detailed understanding of the different queries used by applications. The fast-moving pace of modern application development, combined with a move to the cloud, has shifted the responsibility of indexing to application developers - without giving them the right tools. Application developers today spend a lot of time manually creating indexes for their Postgres queries, reviewing database...
Continue reading5mins of Postgres E19: Speeding up sort performance in Postgres 15
26 May, 2022Today, we're going to talk about four specific improvements made in Postgres 15 that help speeding up sort performance: Sorting a single columnReducing memory consumption by using a different memory allocatorNew specialized sort routines for common data typesReplacing the algorithm that's used for splitting up sorts to disk when they exceed the "work_mem". Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and...
Continue readingDesign at pganalyze: How we use design principles to develop new features and products
24 May, 2022At pganalyze, we know attention and time is very valuable and being respectful of it will allow everyone on the team to think deeply and do high-quality work. That’s why we run only a few standing meetings. It’s one actually. A weekly meeting on Tuesdays. Most of our active working time goes into product development, infrastructure improvements, and working with customers. Apart from that, it is on each team member to organize and schedule working sessions, support triages, or workshops. We are...
Continue reading5mins of Postgres E18: Partition pruning, prepared statements and generic vs custom query plans
19 May, 2022In today’s episode 18, we'll talk about partition pruning and prepared statements in Postgres and run a performance test using the pgbench tool. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in. Performance of prepared statements with partitioning in Postgres In this blog posts by Amit from the EDB team, he talks about how prepared statements and...
Continue reading5mins of Postgres E17: Demystifying Postgres for application developers: A mental model for tables and indexes
12 May, 2022Today, we're going to talk about demystifying database performance for developers. This blog post by Christopher Winslett talks about how to think about your database as an application developer and what kind of mental model you can have to think effectively about how to use indexes and other data structures in your database. We will talk about Postgres Tables, Indexes, Index Cardinality, Table scans, and more! Share this episode: Click here to share this episode on twitter, or sign up for our...
Continue reading5mins of Postgres E16: Incremental Materialized Views with pg_ivm and Looking Ahead to Postgres 15 Beta 1
05 May, 2022Today, we're going to talk about incremental view maintenance, as well as the upcoming Postgres 15 beta release. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's take a look! Using Incremental Materialized Views with pg_ivm So, first of all, this is a new extension released by the SRA OSS team in Japan, which is called pg_ivm. pg_ivm is an extension you...
Continue readingThe Unexpected Find That Freed 20GB of Unused Index Space
04 May, 2022How to free space without dropping indexes or deleting data At pganalyze, we've been a fan of Haki Benita's writing for quite some time. This article back from February 2021 is worth a read. Haki kindly gave us permission to republish his post on the pganalyze blog so you could benefit from it. Haki is a software developer and a technical lead who takes special interest in databases, web development, software design and performance tuning. You can find the original article on his blog at...
Continue reading5mins of Postgres E15: Tuning Postgres' Deterministic Query Planner, Extended Statistics and Join Collapse Limits
28 April, 2022Today, we'll talk about optimizing with the Postgres deterministic query planner, have a look at using extended statistics, and show how to tune Join collapse times. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in! Optimizing with Postgres’ deterministic query planner In this blog post by Alex Bartlow from the engineering team at Aha! he is talking...
Continue reading5mins of Postgres E14: HOT Updates vs Bottom-Up Index Deletion in Postgres 14
21 April, 2022Today, we're going to talk about bottom-up index deletion for B-tree indexes, Postgres pages, Postgres page splits in B-Tree indexes, the bt_page_items function, and how Bottom-Up Index Deletions compare to HOT updates. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look! Comparing HOT Updates and Bottom-Up Index Deletion First of all, let's take a...
Continue reading5mins of Postgres E13: New SQL/JSON and JSON_TABLE features in Postgres 15
14 April, 2022In today's video, we take a look at the SQL/JSON features in Postgres 15. Postgres 15 will most likely be released in September or October, so it's still quite a way until we can actually use this on a production server. However, a couple of days ago, the Postgres 15 feature freeze occurred, which means we now have a good sense for which functionality will make it into this next Postgres release. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and...
Continue reading5mins of Postgres E12: The basics of tuning VACUUM and autovacuum
07 April, 2022Today, we're going to talk about tidying up with VACUUM. Now, as the Postgres Weekly newsletter put it, we can't go a month without covering VACUUM in some way or another. If you recall a few weeks ago, we last talked about "dead tuples not yet removable" and what is essentially a very specific VACUUM problem that causes VACUUM to stall and not make progress. You can find the episodes about this here: 5mins of Postgres E7: Autovacuum, dead tuples not yet removable, and the Postgres xmin horizon...
Continue readingHow Postgres Chooses Which Index To Use For A Query
01 April, 2022Using Postgres sometimes feels like magic. But sometimes the magic is too much, such as when you are trying to understand the reason behind a seemingly bad Postgres query plan. I've often times found myself in a situation where I asked myself: "Postgres, what are you thinking?". Staring at an EXPLAIN plan, seeing a Sequential Scan, and being puzzled as to why Postgres isn't doing what I am expecting. This has led me down the path of reading the Postgres source, in search for answers. Why is...
Continue readingPostgres INSERT ON CONFLICT and how it compares to MERGE in Postgres 15
31 March, 2022Today, in episode 11 of our series, we are taking a look at the MERGE command and how it differs from Postgres' INSERT ON CONFLICT command. Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's get started. MERGE in Postgres 15 Earlier this week, the MERGE command was merged into the Postgres 15 branch. MERGE, for those of you who are not familiar with it, is a...
Continue reading5mins of Postgres E10: max_wal_size, Postgres full page writes and UUID vs BIGINT primary keys
24 March, 2022In today’s episode 10, we'll take a look at max_wal_size, full-page writes and why you might want to choose BIGINT instead of a UUID for your primary key column. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in! Tuning max_wal_size in PostgreSQL In this blog post by Dave Page, he talks about the max_wal_size parameter. He talks about how they've gotten up to 10...
Continue reading5mins of Postgres E9: PostGIS vs GPUs, tuning parallel queries, and automating Citus extension benchmarks with HammerDB
17 March, 2022In this episode, we'll talk about PostGIS versus GPU performance for spatial queries and also take a look at using HammerDB and custom automation scripts to benchmark the Citus extension. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive in! Looking at PostGIS vs. GPU for Performance and Spatial Joins In this blog post by Paul Ramsey, he compares PostGIS performance...
Continue reading5mins of Postgres E8: Postgres Auditing: Table triggers with supabase vs the pgAudit extension
10 March, 2022Today, we'll take a look at Postgres auditing. We'll look at two different approaches: Using triggers in PostgresUsing the pgAudit extension Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's jump in! Using triggers in Postgres for Postgres Auditing Today’s episode was inspired by a blog post by Oliver Rice on the supabase blog, where he talks about Postgres auditing in 150...
Continue reading5mins of Postgres E7: Autovacuum, dead tuples not yet removable, and the Postgres xmin horizon
03 March, 2022Today, we'll talk about autovacuum, dead tuples not yet removable, and what the xmin horizon in Postgres does. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's get started! Vacuuming dead tuples not yet removable in Postgres We'll start with this blog post by Mathieu Tamer from the Precogs team back from 2018. The reason I looked at this blog post again is because it was...
Continue reading5mins of Postgres E6: Optimizing Postgres Text Search with Trigrams and GiST indexes
24 February, 2022Today, we're looking at optimizing Postgres text search with trigrams and take a closer look at GiST indexes. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's have a look together! Improving Postgres text search performance with trigrams Trigrams in general are a way of splitting up a string, like "hello" for example, into individual three-character pieces that are easier...
Continue reading5mins of Postgres E5: Finding the root cause of high CPU utilization in Postgres with EXPLAIN and the Linux perf command
17 February, 2022Today, we'll take a look at a hairy Postgres incident and how we can use Postgres EXPLAIN and the Linux perf command to debug high CPU utilization in Postgres. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's start diving in. Debugging a 100% CPU utilization spike with EXPLAIN and perf In this blog post by Jeremy Schneider, he describes a situation he encountered at AWS...
Continue reading5mins of Postgres E4: Writing your own custom Postgres aggregates with Rust or raw SQL
10 February, 2022Today we'll take a look at how to write custom aggregates in Postgres using either Rust or raw SQL. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's get started! Using pgx to write a Rust extension for Postgres that runs custom aggregate code We'll start with this article by Ana on the Hoverbear blog which describes how to use the new pgx release to write a Rust extension...
Continue reading5mins of Postgres E3: Postgres performance cliffs with large JSONB values and TOAST
03 February, 2022Today, we're going to talk about JSONB and how JSONB performance can behave badly when the JSONB value exceeds the two kilobyte limit where Postgres starts storing data in TOAST. Share this episode: Click here to post to twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel. Transcript Let's dive right in! How Postgres stores data: Large JSON value query performance Now, for those of you who don't know, JSONB is a binary version of the...
Continue reading5mins of Postgres E2: Using unnest(..), generate_series(), and PostGIS
27 January, 2022Today, we're gonna look at a few interesting aspects on how to work with data sets in Postgres. First of all, we'll take look at using UNNEST for performing bulk operations on a larger number of values. Second, we'll take a look at generate_series and how to generate sample data if you do not have large amounts of input data. Last, but not least, we'll take a look at a PostGIS community example, where somebody had a particular challenge, indexing PostGIS data, and we'll take a look at how a...
Continue reading5mins of Postgres E1: Using Postgres statistics to improve bad query plans, pg_hint_plan extension
20 January, 2022Hello, and welcome to 5 minutes of Postgres! Today, we are starting a new weekly video series where we'll walk through interesting articles about Postgres from the last week, as well as evergreen blog posts from the past, and provide context and our personal perspective on it. We are planning to publish a new episode of this series every Thursday, so keep an eye out! We'll always embed the video first, but below it you can find a transcription in case you want to read through what we discussed...
Continue readingPostgres in 2021: An Observer's Year In Review
07 January, 2022Every January, the pganalyze team takes time to sit down to reflect on the year gone by. Of course, we are thinking about pganalyze, our customers and how we can improve our product. But, more importantly, we always take a bird's-eye view at what has happened in our industry, and specifically in the Postgres community. As you can imagine: A lot! So we thought: Instead of trying to summarize everything, let's review what happened with the Postgres project, and what is most exciting from our...
Continue readingThe Fastest Way To Load Data Into Postgres With Ruby on Rails
14 December, 2021Data migration is a delicate and sometimes complicated and time-consuming process. Whether you are loading data from a legacy application to a new application or you just want to move data from one database to another, you’ll most likely need to create a migration script that will be accurate, efficient, and fast to help with the process — especially if you are planning to load a huge amount of data. There are several ways you can load data from an old Rails app or other application to Rails....
Continue readingUnderstanding Postgres GIN Indexes: The Good and the Bad
02 December, 2021Adding, tuning and removing indexes is an essential part of maintaining an application that uses a database. Oftentimes, our applications rely on sophisticated database features and data types, such as JSONB, array types or full text search in Postgres. A simple B-tree index does not work in such situations, for example to index a JSONB column. Instead, we need to look beyond, to GIN indexes. Almost 15 years ago to the dot, GIN indexes were added in Postgres 8.2, and they have since become an...
Continue readingPostgres Views in Django
16 November, 2021At my first job, we worked with a lot of data. I quickly found that when there's a lot of data, there are bound to be some long, convoluted SQL queries. Many of ours contained multiple joins, conditionals, and filters. One of the ways we kept the complexity manageable was to create Postgres views for common queries. Postgres views allow you to query against the results of another query. Views can be composed of columns from one or more tables or even other views, and they are easy to work with...
Continue readingHow we deconstructed the Postgres planner to find indexing opportunities
02 November, 2021Everyone who has used Postgres has directly or indirectly used the Postgres planner. The Postgres planner is central to determining how a query gets executed, whether indexes get used, how tables are joined, and more. When Postgres asks itself "How do we run this query?”, the planner answers. And just like Postgres has evolved over decades, the planner has not stood still either. It can sometimes be challenging to understand what exactly the Postgres planner does, and which data it bases its...
Continue readingA better way to index your Postgres database: pganalyze Index Advisor
23 September, 2021When you run an application with a relational database attached, you will no doubt have encountered this question: Which indexes should I create? For some of us, indexing comes naturally, and B-tree, GIN and GIST are words of everyday use. And for some of us it’s more challenging to find out which index to create, taking a lot of time to get right. But what unites us is that creating and tweaking indexes is part of our job when we use a relational database such as Postgres in production. We...
Continue readingUsing Postgres CREATE INDEX: Understanding operator classes, index types & more
12 August, 2021Most developers working with databases know the challenge: New code gets deployed to production, and suddenly the application is slow. We investigate, look at our APM tools and our database monitoring, and we find out that the new code caused a new query to be issued. We investigate further, and discover the query is not able to use an index. But what makes an index usable by a query, and how can we add the right index in Postgres? In this post we’ll look at the practical aspects of using the...
Continue readingEfficient Pagination: PostgreSQL and Django
20 July, 2021You could say most web frameworks take a naive approach to pagination. Using PostgreSQL’s COUNT, LIMIT, and OFFSET features for pagination works fine for the majority of web applications, but if you have tables with a million records or more, performance degrades quickly. Django is an excellent framework for building web applications, but its default pagination method falls into this trap at scale. In this article, I’ll help you understand Django’s pagination limitations and offer three...
Continue readingPostgreSQL Partitioning in Django
08 July, 2021Postgres 10 introduced partitioning to improve performance for very large database tables. You will typically start to see the performance benefits with tables of 1 million or more records, but the technical complexity usually doesn’t pay off unless you’re dealing with hundreds of gigabytes of data. Though there are several advantages to partitioning, it requires more tables, which can become cumbersome to work with, especially if you change your data structure in the future. Please note: If...
Continue readingGeoDjango and PostGIS in Django
24 June, 2021In this article, I’ll introduce you to spatial data in PostgreSQL and Django. You’ll learn how to use PostGIS and GeoDjango to create, store, and manipulate geographic data (both raster and vector) in a Python web application. Spatial data is any geographic data that contains information related to the earth, such as rivers, boundaries, cities, or natural landmarks. It describes the contours, topology, size, and shape of these features. Maps are a common method of visualizing spatial data,...
Continue readingUsing Postgres Row-Level Security in Ruby on Rails
25 May, 2021Securing access to your Postgres database is more important than ever. With applications growing more complex, often times using multiple programming languages and frameworks within the same app, it can be challenging to ensure access to customer data is handled consistently. For example, if you are building a SaaS application where different companies use the application, you don't want users of Company A to see the data of users in Company B by accident. Sure, you could use create a separate...
Continue readingA look at Postgres 14: Performance and Monitoring Improvements
21 May, 2021The first beta release of the upcoming Postgres 14 release was made available yesterday. In this article we'll take a first look at what's in the beta, with an emphasis on one major performance improvement, as well as three monitoring improvements that caught our attention. Before we get started, I wanted to highlight what always strikes me as an important unique aspect of Postgres: Compared to most other open-source database systems, Postgres is not the project of a single company, but rather...
Continue readingCreating Custom Postgres Data Types in Rails
22 April, 2021Postgres ships with the most widely used common data types, like integers and text, built in, but it's also flexible enough to allow you to define your own data types if your project demands it. Say you're saving price data and you want to ensure that it’s never negative. You might create a not_negative_int type that you could then use to define columns on multiple tables. Or maybe you have data that makes more sense grouped together, like GPS coordinates. Postgres allows you to create a type...
Continue readingIntroducing pg_query 2.0: The easiest way to parse Postgres queries
18 March, 2021Parsing of "SELECT * FROM mytable" SQL statement into the associated Postgres parse tree The query parser is a core component of Postgres: the database needs to understand what data you're asking for in order to return the right results. But this functionality is also useful for all sorts of other tools that work with Postgres queries. A few years ago, we released pg_query to support this functionality in a standalone C library. pganalyze uses pg_query to parse and analyze every SQL query that...
Continue readingEfficient Postgres Full Text Search in Django
24 February, 2021In this article, we'll take a look at making use of the built-in, natural language based Postgres Full Text Search in Django. Internet users have gotten increasingly discerning when it comes to search. When they type a keyword into your website's search bar, they expect to find logically ranked results, including related matches and misspellings. Because users are used to these sophisticated search systems, developers have to build applications that use more than simple LIKE queries. Comparison...
Continue readingCreating Custom Postgres Data Types in Django
15 December, 2020Postgres allows you to define custom data types when the default types provided don't fit your needs. There are many situations where these custom data types come in handy. For example, if you have multiple columns in several tables that should be an int between 0 and 255, you could use a custom data type so that you only have to define the constraints once. Or, if you have complex data - like metadata about a file - and you want to save it to a single column instead of spreading it across...
Continue readingPostGIS vs. Geocoder in Rails
01 October, 2020This article sets out to compare PostGIS in Rails with Geocoder and to highlight a couple of the areas where you'll want to (or need to) reach for one over the other. I will also present some of the terminology and libraries that I found along the way of working on this project and article as I set out to understand PostGIS better and how it is integrated with Rails. If you are interested in learning how to work with geospatial data with PostGIS in Django I recommend having a look at our blog...
Continue readingLessons Learned from Running Postgres 13: Better Performance, Monitoring & More
21 September, 2020Postgres 13 is almost here. It's been in beta since May, and the general availability release is coming any day. We've been following Postgres 13 closely here at pganalyze, and have been running the beta in one of our staging environments for several months now. There are no big new features in Postgres 13, but there are a lot of small but important incremental improvements. Let's take a look. Performance Postgres 13 performance improvements include both built-in optimizations and heuristics...
Continue readingUsing Postgres Row-Level Security in Python and Django
13 August, 2020Postgres introduced row-level security in 2016 to give database administrators a way to limit the rows a user can access, adding an extra layer of data protection. What's nice about RLS is that if a user tries to select or alter a row they don't have access to, their query will return 0 rows, rather than throwing a permissions error. This way, a user can use select * from table_name, and they will only receive the rows they have access to with no knowledge of rows they don't. Most examples of...
Continue readingPostgres JSONB Fields in Django
30 July, 2020I remember the first time I built user preferences into an app. At first, users just needed to be able to opt in or out of our weekly emails. "No big deal," I thought, "I'll just add a new field on the Users table." For a while, that was fine. A few weeks later, my boss asked me if we could let users opt into push notifications. Fine, that's just one more column on the database. Can't hurt, right? You probably see where this is going. Within months, my user table had 40 columns, and while...
Continue readingBuilding SVG Components in React
09 July, 2020React is well known as a great tool for building complex applications from HTML and CSS, but that same approach can also be used with SVG to build sophisticated custom UI elements. In this article, we'll give a brief overview of SVG, when to use it (and when not to), and how to use it effectively in a React application. We'll also briefly touch on how to integrate with d3 (which comes in very useful when working with SVG). React SVG example We relied heavily on SVG to build the charting updates...
Continue readingAdvanced Active Record: Using Subqueries in Rails
24 June, 2020Active Record provides a great balance between the ability to perform simple queries simply, and also the ability to access the raw SQL sometimes required to get our jobs done. In this article, we will see a number of real-life examples of business needs that may arise at our jobs. They will come in the form of a request for data from someone else at the company, where we will first translate the request into SQL, and then into the Rails code necessary to find those records. We will be covering...
Continue readingIntroducing New Charts & Date Picker in pganalyze
29 April, 2020Clear and flexible presentation of data is the bread and butter of a monitoring service. A good one will display the right data, but a great one can guide you toward meaningful insights. Visual representation of data in a clear and concise way can help you make decisions quickly. Today we're releasing multiple updates to pganalyze that will help you get to insights more effectively, and keep your database running smoothly. # This code block gets replaced with the TOC Date range selection as a...
Continue readingFull Text Search in Milliseconds with Rails and PostgreSQL
16 April, 2020Imagine the following scenario: You have a database full of job titles and descriptions, and you’re trying to find the best match. Typically you’d start by using an ILIKE expression, but this requires the search phrase to be an exact match. Then you might use trigrams, allowing spelling mistakes and inexact matches based on word similarity, but this makes it difficult to search using multiple words. What you really want to use is Full Text Search, providing the benefits of ILIKE and trigrams,...
Continue readingEffectively Using Materialized Views in Ruby on Rails
16 January, 2020It's every developer's nightmare: SQL queries that get large and unwieldy. This can happen fairly quickly with the addition of multiple joins, a subquery and some complicated filtering logic. I have personally seen queries grow to nearly one hundred lines long in both the financial services and health industries. Luckily Postgres provides two ways to encapsulate large queries: Views and Materialized Views. In this article, we will cover in detail how to utilize both views and materialized views...
Continue readingIntroducing Automated Postgres EXPLAIN Visualization & Insights
16 December, 2019Today, we’re excited to introduce you to the next evolution of pganalyze. We updated our logo and overall brand, worked on our documentation to help you understand Postgres and its internals better and, most importantly, we’re proud to announce a new key feature on our platform: Automated EXPLAIN Visualization & Insights. # This code block gets replaced with the TOC Offering this functionality to you is a natural progression for us as we’re further refining our focus on providing you with even...
Continue readingSimilarity in Postgres and Rails using Trigrams
19 November, 2019Trigram Example You typed "postgras", did you mean "postgres"? Use the best tool for the job. It seems like solid advice, but there's something to say about keeping things simple. There is a training and maintenance cost that comes with supporting an ever growing number of tools. It may be better advice to use an existing tool that works well, although not perfect, until it hurts. It all depends on your specific case. Postgres is an amazing relational database, and it supports more features...
Continue readingEfficient GraphQL queries in Ruby on Rails & Postgres
24 September, 2019GraphQL puts the user in control of their own destiny. Yes, they are confined to your schema, but beyond that they can access the data in any which way. Will they ask only for the "events", or also for the "category" of each event? We don't really know! In REST based APIs we know ahead of time what will be rendered, and can plan ahead by generating the required data efficiently, often by eager-loading the data we know we'll need. In this article, we will discuss what N+1 queries are, how they...
Continue readingPostgres Connection Tracing, Wait Event Analysis & Vacuum Monitoring go into GA on pganalyze
14 April, 2019We’re excited to announce the general availability of three new pganalyze features: Connection Tracing, Wait Event Analysis, as well as Vacuum Monitoring. These features have been developed based on the feedback of hundreds of customers monitoring their production Postgres databases using pganalyze. Thanks so much for consistently taking the time to provide us with valuable information on how you’d like to see pganalyze evolve! Postgres Connection Tracing & Wait Event Analysis One of the most...
Continue readingPostgres 11: Monitoring JIT performance, Auto Prewarm & Stored Procedures
04 October, 2018Everyone’s favorite database, PostgreSQL, has a new release coming out soon: Postgres 11 In this post we take a look at some of the new features that are part of the release, and in particular review the things you may need to monitor, or can utilize to increase your application and query performance. JIT Performance in Postgres 11 Just-In-Time compilation (JIT) in Postgres 11 Just-In-Time compilation (JIT) for query execution was added in Postgres 11. It's not going to be enabled for queries...
Continue readingPostgres Log Monitoring with pganalyze: Introducing Log Insights 2.0
24 July, 2018TLDR: We recently released substantial improvements to our Log Insights feature, including up to 30 day history, support for Heroku Postgres, as well as support for monitoring the log files of PostgreSQL servers running on-premise. How pganalyze parses Postgres log files Its now been a bit over a year since we first released the log monitoring functionality in pganalyze, and we would like to share a major update with you today. Before diving in, a quick review how the pganalyze collector works...
Continue readingPostgres Log Monitoring 101: Deadlocks, Checkpoint Tuning & Blocked Queries
12 February, 2018Those of us who operate production PostgreSQL databases have many jobs to do - and often there isn't enough time to take a regular look at the Postgres log files. However, often times those logs contain critical details on how new application code is affecting the database due to locking issues, or how certain configuration parameters cause the database to produce I/O spikes. This post highlights three common performance problems you can find by looking at, and automatically filtering your...
Continue readingVisualizing & Tuning Postgres Autovacuum
28 November, 2017VACUUM timeline visualization In this post we'll take a deep dive into one of the mysteries of PostgreSQL: VACUUM and autovacuum. The Postgres autovacuum logic can be tricky to understand and tune - it has many moving parts, and is hard to understand, in particular for application developers who don't spend all day looking at database documentation. But luckily there are recent improvements in Postgres, in particular the addition of pg_stat_progress_vacuum in Postgres 9.6, that make...
Continue readingWhats New in Postgres 10: Monitoring Improvements
04 October, 2017Postgres 10 has been stamped on Monday, and will most likely be released this week, so this seems like a good time to review what this new release brings in terms of Monitoring functionality built into the database. In this post you'll see a few things that we find exciting about the new release, as well as some tips on what to adjust, whether you use a hosted Postgres monitoring tool like pganalyze, or if you've written your own scripts. New "pg_monitor" Monitoring Role Most users of Postgres...
Continue readingIntroducing Log Insights: Realtime Analysis of Postgres Logs
07 June, 2017After significant development effort, we're excited to introduce you to a new part of pganalyze that we believe every production Postgres database needs: pganalyze Log Insights UPDATE: We released pganalyze Log Insights 2.0 - read more about it in our article: Postgres Log Monitoring with pganalyze: Introducing Log Insights. In the past you used generic log management systems and setup your own filtering and altering rules, which required a lot of manual effort, as well as knowledge of all...
Continue readingMonitoring PostgreSQL 9.5 & Improved Weekly Reports
06 July, 2015Last week the first official alpha version of PostgreSQL 9.5 was released. Whilst the stable release is still 2-3 months away, now is a good time to review what is upcoming, and which changes and improvements we can expect. Here is an overview of the most important changes for monitoring tools: pg_stat_statements gets new columns min_time, max_time, mean_time & stddev_time - making it much easier to identify outliers in the query statisticsNew pg_stat_ssl view that shows active SSL connections...
Continue readingIntroducing pg_query: Parse PostgreSQL queries in Ruby
17 June, 2014In this article we'll take a look at the new pg_query Ruby library. pg_query is a Ruby library I wrote to help you parse SQL queries and work with the PostgreSQL parse tree. We use this extension inside pganalyze to provide contextual information for each query and find columns which might need an index. At the end of this article you'll also find monitor.rb - a ready-to-use example that filters pg_stat_statements output and restricts it to only show a specific table. Existing Solutions to...
Continue readingAnnouncing The All-New Database Check-Up
20 March, 2014We’ve just launched our new version of Database Check-Up - allowing you to see more quickly what could be relevant to look at in your database. New Database Check-Up In addition we’ve also revamped the detail pages of queries, tables, indices and config settings to match the new style: New Database Check-Up: Query New Database Check-Up: Table New Database Check-Up: Index Improved Check-Up: Config Settings When working with other people's PostgreSQL databases, we’ve seen a lot of things, from...
Continue reading