5mins of Postgres E25: Postgres lock monitoring, LWLocks and the log_lock_waits setting
Today, we are talking about Postgres locking: when it is concerning, what lightweight locks are, and how the log_lock_waits setting works in Postgres.
Transcript
Let's jump in.
Postgres lock monitoring with the pg_locks view
We'll start with this blog post by David (@pg_dwc) from the Crunchy Data team.
If you want to monitor which locks Postgres uses, you can use the pg_locks view. The pg_locks view is something you can query, similar to pg_stat_activity. There are a couple of columns there that help you identify which type of lock it is, and whether it's an exclusive or a shared lock. The most important thing to know here is that there's a granted field, and this lets you know whether a lock is currently granted to a particular connection, or whether it's waiting on another connection to release its locks first, so that it can then take its own lock. Generally, the way to think about this view is that a lot of things take, for example, AccessShare locks, and as David states here, it's not necessarily a good idea to look at the raw count of locks; that doesn't necessarily tell you much.
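For example, a quick way to take a look at the view yourself is something like the following (a minimal sketch; the relation::regclass cast resolves relation locks to table names, and is NULL for other lock types):
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE pid != pg_backend_pid(); -- exclude the locks taken by this query itself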
Counting how many locks are not granted
But what can be helpful is to look at the number of locks that are currently not granted, as well as which specific statements are waiting for something. A very simple monitoring query could look like this:
SELECT COUNT(*) FROM pg_locks WHERE NOT granted
That would give you a sense for how much the system is blocked by locks not being granted. If you saw a lot of lock contention, where things were waiting for the same lock, this would give you an indication of that.
Finding the blocking process when a query is waiting for a lock
Let's say you have a production issue where you're seeing a lot of locks currently being held. Then the best thing you can do is query the pg_stat_activity view using the pg_blocking_pids function. What we're doing here is querying for all connections that are currently waiting on a lock, and for each one, pg_blocking_pids gives us the connections holding the lock that keeps us from getting ours. This helps you identify which connection is causing the problem, and then you can potentially go and cancel it, for example, as shown after the query below.
SELECT pid, pg_blocking_pids(pid), wait_event, wait_event_type, query
FROM pg_stat_activity
WHERE backend_type = 'client backend'
AND wait_event_type ~ 'Lock'
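Once you've identified the blocking process, you can cancel its current query, or terminate its connection entirely. A sketch, with 12345 standing in for the actual blocking pid returned by pg_blocking_pids:
SELECT pg_cancel_backend(12345);    -- cancels the backend's current query
SELECT pg_terminate_backend(12345); -- terminates the whole connection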
Understanding lightweight locks (LWLocks)
The other kind of locks that David talks about here are lightweight locks, and you might be familiar with those if you look at pg_stat_activity and see a wait event of type LWLock, for example a BufferContent LWLock. Those are Postgres-internal locks.
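To see just the sessions that are currently waiting on a lightweight lock, you could filter pg_stat_activity like this (a minimal sketch):
SELECT pid, wait_event, query
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock';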
If you want to know more about that, you can actually take a look at the Postgres source documentation. I often find that useful to get a sense for how Postgres hackers themselves describe things. The documentation for the lock manager, which does a lot of this work in Postgres, describes lightweight locks: these are typically used for controlling access to data structures in shared memory, the memory area in Postgres that's shared between different connections. You don't really interact with these directly; there is no way to explicitly take a lightweight lock. But they are helpful for understanding which resource in Postgres is busy, and in which part of the system a lot of work is happening.
The lock type that you would usually interact with, and the one that David described in his post, is what the source documentation calls "regular locks", or "heavyweight locks". Those are the ones that have things like deadlock detection and automatic release at transaction end, and they are what Postgres describes as user-driven lock requests. If you interact with a lock-related command, that is all about regular locks, not about the other lock types.
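For illustration, here is what such user-driven lock requests look like. LOCK TABLE and the advisory lock functions both take regular locks (a sketch; my_table is a hypothetical table):
BEGIN;
LOCK TABLE my_table IN ACCESS EXCLUSIVE MODE; -- held until the transaction ends
COMMIT;

SELECT pg_advisory_lock(1, 2);   -- advisory locks are regular locks, keyed by application-chosen numbers
SELECT pg_advisory_unlock(1, 2);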
Tracking locking problems over time with log_lock_waits
The other thing I want to mention around locking is a very useful log event for finding lock problems. This older blog post by Gabrielle (@gorthx) from the PGExperts team talks about the log_lock_waits setting.
Postgres has a process that checks for deadlocks, and log_lock_waits piggybacks on that deadlock check to also log when a lock has been waited on for longer than deadlock_timeout, which is one second by default. The log_lock_waits setting is not on by default; you have to enable it in your server configuration.
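For example, you could enable it like this (requires superuser privileges; log_lock_waits can be changed without a restart):
ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf(); -- pick up the new setting
SHOW deadlock_timeout;   -- the wait time after which the message is logged (default: 1s)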
If you have log_lock_waits enabled, it actually logs to the Postgres log, and it will give you something like the following, saying the process running this SELECT is still waiting for an ExclusiveLock after more than one second:
LOG: process 663 still waiting for ExclusiveLock on advisory lock [233136,1,2,2] after 1000.365 ms
DETAIL: Process holding the lock: 660. Wait queue: 663.
STATEMENT: SELECT pg_advisory_lock(1, 2);
(example from Log Insights: Lock Events - L71: Process still waiting for lock on tuple / relation / object)
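You can reproduce this exact log output yourself with two psql sessions, assuming log_lock_waits is enabled:
-- Session 1: take the advisory lock and hold it
SELECT pg_advisory_lock(1, 2);

-- Session 2: blocks, and after deadlock_timeout the "still waiting" line is logged
SELECT pg_advisory_lock(1, 2);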
The other useful thing: if the lock gets released and the waiting statement can then finish, you actually get a second log message, which says that the process acquired this type of lock after this much time:
LOG: process 583 acquired AccessExclusiveLock on relation 185044 of database 16384 after 2175.443 ms
STATEMENT: ALTER TABLE x ADD COLUMN y text;
(example from Log Insights: Lock Events - L70: Process acquired lock on tuple / relation / object)
If you have locking problems, this lets you know how much of a query's execution time was spent just waiting for locks. That's why the log_lock_waits setting is something that I always enable.
Should I enable log_lock_waits on my production system?
You may wonder whether there is a performance cost to enabling log_lock_waits. The good news is, there isn't, because the deadlock detection logic already runs in the background whenever a session has been waiting on a lock for longer than deadlock_timeout. You're not really adding anything new; all you're adding is a little log message that gets output.
On production systems I always turn on the log_lock_waits setting, because it's just that useful.
Thanks so much for joining us for this episode of 5mins of Postgres. Subscribe to our YouTube channel, sign up for our newsletter and follow us on Twitter to get updates about new episodes!
What we have discussed in this episode of 5mins of Postgres
Postgres Locking: When is it Concerning?
Avoid PostgreSQL LWLock:buffer_content locks in Amazon Aurora: Tips and best practices
Postgres Source Code Docs: Locking Overview
Simple test for lock_waits log messages
Postgres Log Monitoring 101: Deadlocks, Checkpoint Tuning & Blocked Queries
Log Insights: Lock Events - L71: Process still waiting for lock on tuple / relation / object
Log Insights: Lock Events - L70: Process acquired lock on tuple / relation / object
What is the performance cost of enabling "log_lock_waits" on Postgres?