Tracing Locks in Postgres using eBPF programs
In episode 53 of 5mins of Postgres, we talk about tracing Postgres locks using eBPF programs. Specifically, we’re looking at pg_lock_tracer and pg_lw_lock_tracer by Jan Nidzwetzki.
Let's have a look!
In this blog post, Jan from the TimescaleDB team describes two projects that he started over the holiday break. These tools use eBPF programs in Linux to get additional statistics from Postgres that are not accessible otherwise.
Importantly, these programs only work when you're running Postgres on your own virtual machine, but they are pretty cool and I think they show promise of how to use eBPF for monitoring Postgres.
First of all, Jan introduced a tool called pg_lock_tracer.
pg_lock_tracer is inspired, in spirit, by what the pg_locks view shows you: which locks are currently held by which Postgres connection. The difference is that pg_lock_tracer looks at every single locking invocation and summarizes them over time. Obviously, this has some overhead, so it's not necessarily something you would run on your production system all the time, but it might make sense if you are facing issues that you can't debug otherwise.
Generally speaking, an eBPF (extended Berkeley Packet Filter) program runs inside the Linux kernel. It's essentially a way to have low-level analysis functionality running through the kernel that can look at activity on the system or in particular processes.
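As a generic illustration of this idea (not specific to Jan's tools), a one-line bpftrace program can attach to a kernel tracepoint and report, for example, which processes are opening which files:

```shell
# Requires root and the bpftrace package; traces until you press Ctrl-C.
# Prints the process name and file path for every openat() syscall.
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s -> %s\n", comm, str(args->filename)); }'
```

The same mechanism, pointed at a Postgres process instead of a kernel tracepoint, is what the lock tracers below build on.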
pg_lock_tracer looks at statistics about taking locks, like
- the number of locks,
- the lock types, and
- the delay for acquiring the lock.
As Jan describes, this is intended for developers or sysadmins who want additional information about the internals of Postgres. I think a good audience for this are the people who would usually run "perf" on a Linux VM, which you can use to see which part of Postgres is busy if you're running Postgres on your own VM. This goes beyond just showing the stack trace, because it gives you information on which lock is being held.
You can install pg_lock_tracer with pip, the Python package manager. You will also need the BPF tools installed, which are required to compile the kernel-specific eBPF program. Once installed, you run the tracer against your Postgres installation, and there are two requirements:
- it needs to know the location of your Postgres binary, and
- it needs to be able to make a connection to the database.
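As a rough sketch, the setup looks like this. The package name and the `-x`/`-p` flags are from my reading of the project's README, so check `pg_lock_tracer --help` for the exact options of your version, and adjust the binary path and PID to your system:

```shell
# Install the tracer via pip (the BCC/BPF toolchain must already be
# installed system-wide, e.g. via your distribution's packages)
pip install pg-lock-tracer

# Attach to a running Postgres backend:
#   -x points at the postgres binary on disk
#   -p is the PID of the backend process you want to trace
sudo pg_lock_tracer -x /usr/lib/postgresql/15/bin/postgres -p 12345
```

Tracing needs root privileges because loading eBPF programs into the kernel is a privileged operation.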
You can issue a query, like creating a table, and then the tracer outputs which activity is happening behind the scenes when you're doing this CREATE TABLE. Jan shows in his article that, as part of creating a table, Postgres will actually take locks on various internal catalog tables; his output is a short version of all the different activity.
This can help you understand what Postgres is doing behind the scenes. At the end, the tracer outputs summary statistics, which can surface things like: “most of the locks taken were on this pg_catalog "pg_depend_reference_index" relation”, and gives insight into how long it took to get that lock.
You can also see the lock types, like an access share lock, row exclusive lock, or access exclusive lock. This might help you if your system is very busy and you're trying to understand which locks are being held or acquired, and where the busy part is.
Another thing that Jan created was pg_lw_lock_tracer. In Postgres, there are two different lock types. There are
- lightweight locks, used internally in Postgres
- heavyweight locks, which we just had a look at.
pg_lw_lock_tracer is for tracing lightweight locks.
This can be helpful if you want to go beyond the wait event sampling that a lot of tools do today, and this will let you look at the individual locks that are being held inside Postgres.
pg_lw_lock_tracer uses the USDT trace points in Postgres. These are specific trace points that are predefined in the Postgres source code, and which you can hook into, for example, with a BPF program. You have to have compiled Postgres with the --enable-dtrace option. For historic reasons this is referred to as "dtrace", but really these days you would call these USDT probes. If you have compiled it with that option, you can use the bpftrace program to list these trace points in the Postgres binary. Really, what you're interested in is that this shows you trace points, because it confirms that Postgres was compiled with the right settings.
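For example, assuming a self-compiled Postgres installed under /usr/local/pgsql (the path is illustrative), building with probes enabled and then listing them could look like this:

```shell
# Configure and build Postgres with USDT probes enabled
./configure --enable-dtrace && make && sudo make install

# List the USDT trace points compiled into the binary; if this prints
# probes such as the lwlock acquire/release probes, the build is set
# up correctly for pg_lw_lock_tracer
sudo bpftrace -l 'usdt:/usr/local/pgsql/bin/postgres:*'
```

If the list comes back empty, the binary was most likely built without --enable-dtrace.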
After you've done that, you can install pg_lw_lock_tracer, which will show you how LW locks are being acquired. And these are very fast, short locks, right? We're talking about internal locks, like the buffer manager, or, if different processes try to insert into the WAL, Postgres has a lightweight lock that is used for that.
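A rough invocation sketch. My reading of the repository is that the same pip package ships both tracers and that pg_lw_lock_tracer mainly needs the PID of the backend to attach to; verify the exact flags with `--help`, and substitute your own backend PID:

```shell
# The pg-lock-tracer package ships both tracers (assumption from the repo)
pip install pg-lock-tracer

# Attach to a backend process and trace its lightweight lock activity
sudo pg_lw_lock_tracer -p 12345
```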
This is something I would be even more careful with running on an actual production system, because there is a little bit of overhead to looking at these probes. But again, if you have a production situation and you're looking for more information, this may be of use: you get statistics on how many LW locks were acquired over time and which function inside Postgres was used to acquire them. That may also help you understand when there is contention: is Postgres waiting on this lock a lot, which is then slowing down your queries?
Quite useful information that's otherwise impossible to get. If you have your own virtual machine and you're already using tools like "perf", this seems like a great addition to your toolbelt.
On GitHub, you can find both the "pg_lock_tracer" and the "pg_lw_lock_tracer" projects. I think it's really great to see more folks trying out eBPF programs, especially combining that with monitoring Postgres.