Introducing Query Tuning Workbooks to safely tune Postgres queries on production with pganalyze!

Postgres connection pooling: Comparing PgCat and pgbouncer

In episode 60 of “5mins of Postgres” we’re talking about PgCat, a new connection pooler for Postgres, and how it compares to pgbouncer, which a lot of folks currently use.



Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and subscribe to our YouTube channel.


Transcript

Let's get to it!

PgCat - A new Postgres connection pooler

First, I encourage you to read through the original Hacker News post by Lev from the PostgresML team where he announced PgCat a little bit over a year ago, and he announced it as a way of taking pgbouncer to the next level.

This is an early project - it was literally started a year ago - but it does show some early promises. Essentially, it is a Rust-based connection pooler that tries to be compatible with pgbouncer, including some of the management APIs. And it extends it beyond pgbouncer's capabilities, for example it features load balancing as well as some sharding options.

It extends it beyond pgbouncer's capabilities, for example it features load balancing as well as some sharding options.

PgCat itself does describe itself as beta. I would be careful using it in production, and I would expect certain things to not work or error out. Now, what is very promising and has me personally excited is that the team at Instacart actually has started adopting PgCat!

How Instacart adopted PgCat for Postgres connection pooling

This is a post by Mostafa, Zain and Andrew from the Instacart team where they're describing how they adopted PgCat.

They use Postgres for a lot of things at Instacart, and they are squeezing out a lot of performance out of individual Postgres instances. But oftentimes the only way to scale is to do things like having additional read replicas or also doing sharding on some of the databases. In their article, they mainly talk about load balancing for read replicas and other benefits of using PgCat.

Postgres query load balancing in Ruby on Rails

In the case of read replicas, you do have to have some awareness of how many replicas there are. Let's say you have five read replicas, your application actually needs to know how to make use of them and needs to load balance read query traffic between them.

Instacart has actually solved this at the application side historically. They have written a library called Makara, which is a Ruby library that handles this inside Rails, inside ActiveRecord, and does load balancing and failover. Part of the motivation here is to have a cross-language solution to this load balancing and failover problem.

Comparing pgbouncer, Odyssey and PgCat

The team at Instacard has done a comparison between different options. The four options they've evaluated were

  1. using pure pgbouncer
  2. using pgbouncer with a side car in a container to handle the failover situation
  3. using Yandex's Odyssey project
  4. using PgCat.

The big benefit of both PgCat and Odyssey is that they're multi-threaded. If you use pgbouncer at scale, you very likely have run into the problem that pgbouncer is ultimately a single-threaded asynchronous connection based pooler.

It is pretty good at handling a lot of connections, but if you have a lot of activities for individual connections, that's where you'll run into bottlenecks. Both Odyssey and PgCat don't have this issue.

All of them support the pgbouncer control commands, like reloading, showing pool information and such. This is really useful, you can use the pgbouncer control language as a universal language across all these.

The main benefit of PgCat is that it handles replica failovers and it also has improvements around poorly behaved clients.

The big benefit of both PgCat and Odyssey is that they're multi-threaded.

How Rust is allowing for PgCat's stability and security

And last but not least, it is written in Rust compared to all other options here, which are written in C or C++.

Rust's big benefit is that it has increased memory safety. It's much easier to trust a new service that's written in Rust. "Will it corrupt memory", "Will it have security bugs?", there's less likelihood of those. I think that's part of why they felt it was worth pursuing this direction.

Latency - PgCat vs pgbouncer

The first thing they ran was a comparison in terms of latency. pgbouncer is pretty fast because pgbouncer is written in C – it's not that complicated. Really, what they were measuring is whether PgCat has overhead. Is it slower than pgbouncer? The good news is that in their production test, they've seen that this was not a problem. It's slightly slower, but that's in the order of microseconds in the most part.

Overall, this gives me confidence that PgCat is ready to serve production traffic from a query latency perspective.

Download Free eBook: How To Get 3x Faster Postgres

Using PgCat in production for Postgres connection pooling

They've also done additional tests on correctness. That's where they've contributed some fixes, they, I think, are the first big production use case for PgCat, and so obviously a lot of things were not in place yet, but are now in place. And so the good news is that they're now at a point where PgCat can handle their production workloads without issues.

Overall, I think Instacart is pretty positive with this. They've clearly run this with a lot of thought and have done a lot of testing. If I were considering whether to try out PgCat, this would give me confidence to try it out.

I do think it's still a bit of a moving target, but PgCat is targeting a 1.0 release in the next coming weeks. Once the 1.0 is out, combined with Instacart's production testing, this gives a very good starting point for this to become a new mainstream connection pooler that a lot of people in Postgres will use.

Thanks for checking out E60 of 5mins of Postgres. Subscribe to our YouTube channel, sign up for our newsletter and follow us on LinkedIn and Twitter to get updates about new episodes!

What we have discussed in this episode of 5mins of Postgres


Enjoy blog posts like this?

Get them once a month to your inbox