UUIDs vs Serial for Primary Keys - what's the right choice?
In E59 of “5mins of Postgres” we’re talking about UUIDs vs Serials for primary keys. In our last episode, we talked about what happens if you use a regular four byte integer for your primary key and you then run out of space. Today, I want to talk about the scenario where you have made the right choice to use bigints in your system, but you're trying to decide whether you want to use UUIDs going forward. Or the other way around, if you are currently using UUIDs, and you're unsure if that's the right choice - this episode is for you.
Let's dive in!
I want to start with this blog post by Christophe Pettus from the PG Experts team. In it, Christophe mentions that if you're asking yourself: “should a UUID or a bigint be the primary key?” what you’re really doing is conflating two different concepts. What he's suggesting is that instead you should look at this problem by asking: “should our keys be random or sequential?” That's the first question to answer. Only after that you should consider if your keys should be 64 bits or larger. UUIDs are 128 bits, whereas a bigint would be 64 bits.
Now, let's start by looking at his question: random or sequential?
Randomness can be quite helpful. For example, a random value is harder to do an enumeration type attack on. Let's say you have an
orders table in your database and you're listing your orders in your URL scheme. If you have "/order/1", then people will know that they're the first order, the first person who bought something from your store, for example. That's maybe bad for business because people think you're too small. It also has the downside that if there is a potential attack to get other people's data, it makes it easier to discover and try that out because the value of the next record is known.
Second, it's much easier to handle random keys if you're looking at all your records in your whole database, and you're just trying to find something. It's very easy to make sure that you are just hitting that specific record, so randomness can be helpful to identify things uniquely.
It's very easy to make sure that you are just hitting that specific record, so randomness can be helpful to identify things uniquely.
One thing that's not mentioned in Christophe’s article is that if you're using any kind of distributed database, for example if you're using the Citus extension for Postgres (even though Citus has ways to do sequential sequences across multiple distributed nodes), it is much easier to handle this with a random number. If you're using distributed systems, there's another argument to have something that's at least semi random.
There are a couple of reasons why you would want to have a sequential key. Sequential keys are faster to generate, for example. A sequential key doesn't require access to a random number generator, you just need to increment a single field. That's pretty fast on most machines.
Further, sequential keys interact much better with B-tree indexes. If you have a B-tree index, and if you're using Postgres probably most of your indexes are going to be B-tree indexes, you will see that a random value interacts really badly with how the B-tree index gets structured. You will see index bloat if you're using random keys. This is a strong argument for sequential keys, if you can use them.
A random value interacts really badly with how the B-tree index gets structured. You will see index bloat if you’re using random keys.
The other thing that Christophe talks about is that this is not necessarily a discussion of “bigint versus UUID”. It's also a discussion about the size of the data type. There is an argument to be made that a 128 bit value might not be a good choice, because Postgres, on a typical 64 bit system, is going to keep things in a simple "Datum" type if it's 64 bits, but if it's 128 bits, it has to do a little bit more memory management to handle those. This means there is overhead to 128 bits compared to 64 bits.
In another post, Brandur from the Crunchy Data team references a benchmark that 2ndquadrant ran a few years ago. You can easily reproduce this yourself on any modern Postgres as well if you’d like.
In the example they did they either assigned values for an index at random, or they assigned values using time or sequentially. When you have a look at Brandur’s blog post you can see a blue bar that shows you why randomness is bad for performance. Especially on larger datasets, performance will quickly tank if you're using a random distribution. That's essentially because there's too much randomness in all the new indexed values. This causes the index to bloat. Similarly, you'll see that the WAL size of such random updates is significantly larger. That can be explained by index updates in Postgres, when there is no more space in a B-tree page, having to do a page split. The page split can cause a bigger portion of the index to be updated than just the individual index entry. This causes downstream impact on the WAL.
Now, Brandur talks a bit more about the hybrid approach. First of all, he points out the "sequential-uuids" extension that 2ndquadrant had published at the time. You can use this extension to either do
- a "sequence + random" or
- a "time + random" UUID.
In the case of the sequence of the 128 bit UUID, you could reserve 16 bits for the sequence, counting up, and then the rest would be random. In this scenario you would still get the benefit of the randomness, you would still prevent enumeration attacks, but you would get the benefit of the index overhead being fairly low as well.
Similarly, there's a concept called ULIDs (we wrote about them in our newsletter issue of April 2022. ULIDs are the same idea, but with time. We have 48 bits of time and 80 bits of randomness.
Last, there are actually efforts to standardize this better. There is a recent email thread on the Postgres hackers mailing list that talks about the upcoming UUIDv7 standard. UUIDv7 will introduce things like a time ordered UUID in a more official manner. This is exciting because I think it will really help avoid this problem going forward as we can eventually start defaulting to something that's time prefixed.