5mins of Postgres E33: Postgres on Kubernetes, choosing the right operator, and handling major version upgrades
Today, we're gonna talk about Postgres on Kubernetes: how it has improved over the years, and two recent improvements to Postgres operators.
Let's have a look!
In this blog post, Stefanie Stölting describes how Postgres on Kubernetes has improved and evolved over the years.
You may remember times when Kubernetes didn't have persistent storage, and running a database on Kubernetes was not a good choice: if your pod moved to another server, your database could suddenly disappear and your data would be lost. The good news is that Kubernetes has improved a lot in the last couple of years. Persistent volumes are a very common sight in a Kubernetes cluster today.
It's not like you need a lot of things to run Postgres in its simplest form on Kubernetes. But if you do that, you don't get backups, you don't get easy configuration, you don't get extensions, and so on. The role that operators play is to make this easier, to turn Postgres on Kubernetes into something you can actually use in production.
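To illustrate what "the simplest form" means here, the following is a hedged, minimal sketch of running Postgres on Kubernetes with nothing but a StatefulSet and a persistent volume claim. All names are made up, and this is exactly the setup that gives you no backups, no high availability, and no tooling, which is the gap operators fill:

```yaml
# Minimal sketch: a single Postgres instance on Kubernetes, no operator.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret   # Secret assumed to exist
                  key: password
            # Keep the data directory in a subdirectory of the mount,
            # so initdb doesn't trip over the volume's lost+found folder.
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: pgdata
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Everything beyond this, backups, failover, connection pooling, upgrades, is what the operators discussed below take care of.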
In her article, Stefanie describes the different operators that exist today in the Postgres community, starting with the CloudNativePG operator. This is a more recent entrant to the Postgres-on-Kubernetes space, developed by EDB but since contributed to the Cloud Native Computing Foundation community, where it is developed as an open source project.
The two operators that have existed for a long time are the PGO operator by Crunchy Data and the postgres-operator by Zalando. If you were to deploy Postgres on Kubernetes in production today, I would focus on these two.
Stefanie has done a very helpful feature comparison between the different operators:
- Which Postgres version is supported?
- How does it support high availability?
- How are backups done?
- Which connection pooler is in use?
- Are extensions and major version upgrades supported?
Stefanie concludes that, even with these operators, you still have to adjust Postgres configuration settings yourself.
A lot of these operators help you integrate Postgres with Kubernetes, but they don't tune Postgres for you. For example, you may have to tune WAL settings if you're running a pgbench benchmark, and you have to watch out when you're updating the Postgres configuration.
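As a hedged illustration of what such hand-tuning can look like, CloudNativePG exposes Postgres settings under `spec.postgresql.parameters` in the Cluster resource. The values below are made-up examples, not recommendations, and you should verify the field layout against the current CloudNativePG documentation:

```yaml
# Fragment of a CloudNativePG Cluster spec: the operator wires up
# Kubernetes, but these Postgres settings are still yours to choose.
spec:
  postgresql:
    parameters:
      shared_buffers: "2GB"        # example value only
      max_wal_size: "8GB"          # example value only
      checkpoint_timeout: "15min"  # example value only
```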
I did want to take a closer look at two recent improvements to these operators.
First of all, I wanted to look at what the team at EDB has been up to with their CloudNativePG operator. Gabriele Bartolini wrote about how you can now import data into Postgres on Kubernetes in this article.
What they've added is that, when you're bootstrapping a new database server, you can now specify an import. You can say: please import this database "freddie", for example, from this external cluster source. What this does is connect to your cluster, run a pg_dump, and import the data into the Postgres instance managed by the CloudNativePG operator. In Gabriele's example he uses an RDS database as the source, importing it into a Kubernetes cluster. Pretty straightforward. I think this could be very useful if you're using Kubernetes in your staging or QA environments.
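To sketch what such a declarative import might look like: the manifest below is hypothetical (cluster names, hostnames, and the Secret are made up), and the field layout follows the CloudNativePG documentation for the import feature, so check the current docs before relying on it:

```yaml
# Hypothetical CloudNativePG Cluster that bootstraps by importing
# the database "freddie" from an external (e.g. RDS) source.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: freddie-staging            # made-up target cluster name
spec:
  instances: 1
  storage:
    size: 10Gi
  bootstrap:
    initdb:
      import:
        type: microservice         # import a single database
        databases:
          - freddie
        source:
          externalCluster: rds-source
  externalClusters:
    - name: rds-source             # made-up source definition
      connectionParameters:
        host: mydb.example.us-east-1.rds.amazonaws.com
        user: postgres
        dbname: freddie
      password:
        name: rds-source-secret    # Secret assumed to exist
        key: password
```

Under the hood this is the pg_dump-and-restore flow described above, just expressed declaratively in the cluster manifest.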
The other article I found interesting was this post by Andrew from the Crunchy Data team, where he describes how the PGO operator now makes it easy to do major version upgrades. This is actually pretty cool. I think this shows pretty well why an operator can be a lot more sophisticated than a simple pod. In his post, Andrew describes their new PGUpgrade API that is now part of the PGO operator.
The way this works is that when you want to do a major version upgrade, from Postgres 13 to 14 for example, you create a new resource, a special custom resource in Kubernetes called PGUpgrade. You can tell it:
- This is my cluster that I want to upgrade
- This is the Postgres version I'm upgrading from
- and this is the Postgres version I'm upgrading to.
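As a hedged sketch, such a manifest could look roughly like this. The resource names are hypothetical, and the field names follow the PGO v5 documentation, so verify them against the current docs:

```yaml
# Hypothetical PGUpgrade resource for Crunchy Data's PGO operator.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PGUpgrade
metadata:
  name: hippo-upgrade
spec:
  postgresClusterName: hippo    # the cluster I want to upgrade
  fromPostgresVersion: 13       # the version I'm upgrading from
  toPostgresVersion: 14         # the version I'm upgrading to
```

If I read the PGO docs correctly, the operator also expects the cluster to be shut down and annotated to allow the upgrade before it actually proceeds.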
This is important because when you run the pg_upgrade command in Postgres, you need to have access to both the old binaries and the new binaries of these Postgres versions.
It's not a matter of just starting up the new version. You do actually need to do a lot of handling around this.
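To make concrete why both sets of binaries matter, here is a hedged sketch of the kind of pg_upgrade invocation an operator has to orchestrate for you (the paths are hypothetical; the flags are standard pg_upgrade options):

```shell
# Both the old (13) and new (14) binaries must be available in the container.
pg_upgrade \
  --old-bindir=/usr/pgsql-13/bin \
  --new-bindir=/usr/pgsql-14/bin \
  --old-datadir=/pgdata/pg13 \
  --new-datadir=/pgdata/pg14 \
  --link   # hard-link data files instead of copying them
```

The --link mode is a big part of why the upgrade itself can be so quick: data files are hard-linked into the new data directory rather than copied.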
pg_upgrade is pretty fast behind the scenes. I think this is as smooth as you would get it. Maybe even smoother than you would get it on a managed database as a service!
The Crunchy Operator clearly has a lot of sophistication built into it, which would give me confidence if I were to deploy it in production.