VACUUM: Bloat - VACUUM Blocked by Xmin Horizon
Check Frequency
Every 30 minutes
Default Configuration
Detects when the xmin horizon on the server was assigned more than 24
hours ago and creates an issue with severity "warning". The issue will be created even if no VACUUM is currently blocked by this, as this will potentially block any future VACUUMs. Resolves once the assignment age of the xmin horizon is under the threshold.
This check is enabled by default. These parameters can be tuned in the Configure section of the Alerts & Check-Up page.
Guidance
Impact
VACUUM is potentially blocked by the xmin horizon.
The xmin horizon tells you up to which point the vacuum process can clean up dead rows. When this value is behind and not advancing, VACUUMs will be blocked and will not be able to clean up dead rows.
When VACUUM is blocked and dead rows can't be cleaned, it can result in table bloat and slow queries.
The xmin horizon tracked by pganalyze is the oldest xmin across all databases, summarized as a server-wide metric. Note that while some types of xmin horizon delays affect all databases on a server, such as the xmin for each physical replication slots, in other cases not all databases of a server will be impacted, as Postgres can still clean up dead rows of tables on unrelated databases. For example, if the xmin horizon is held back by a long running transaction on database1, Postgres can still clean up dead rows of tables in database2, as the transaction of database1 will never be able to see database2.
Common Causes and Solutions
Long-running transactions
Long-running transactions may still need to access rows that could otherwise be considered dead, so they can block cleanup.
Solution
You can find the transaction holding back the xmin horizon and its connection's pid by running the following command:
SELECT pid, datname, usename, state, backend_xmin, backend_xid FROM pg_stat_activity WHERE backend_xmin IS NOT NULL OR backend_xid IS NOT NULL ORDER BY greatest(age(backend_xmin), age(backend_xid)) DESC;
You can cancel it by running either of commands:
SELECT pg_cancel_backend('<query_pid>');
Lagging or stale physical replication slots
With physical streaming replication with
hot_standby_feedback
is on, when replication is lagging or a replica server is stale (e.g. down), the oldest transaction that the replication slot needs the database to retain can be "stuck", holding back the xmin horizon.Solution
You can check the replication status on the Replication page.
If the replication slot is no longer used, you can remove it by running the following command:
SELECT pg_drop_replication_slot('<slot_name>');
Lagging or stale logical replication slots
With logical replication slots, replication can also get stale when DDL changes (database migrations) don't get applied to a replica server (subscriber). When the subscriber was unable to replicate data due to a schema mismatch, replication will error and get stale. This causes the xmin of the system catalogs of the primary (publisher) to be held back until replication resumes.
Solution
You can check the replication status on the Replication page. You may also want to check logs on both the publisher and the subscriber for any error messages regarding logical replication, such as schema differences.
If the replication slot is no longer used, remove it it by running the following command:
SELECT pg_drop_replication_slot('<slot_name>');
Long-running queries on standbys
When
hot_standby_feedback
is on with physical streaming replication, queries on standbys will hold back the xmin horizon just as if they were running on the primary.Solution
You can find the
xmin
of all standby servers by running the following command:SELECT application_name, client_addr, backend_xmin FROM pg_stat_replication ORDER BY age(backend_xmin) DESC;
Once the standby is identified, you can find the query holding back the xmin horizon and its connection's pid in that standby by running the following command:
SELECT pid, datname, usename, state, backend_xmin, backend_xid FROM pg_stat_activity WHERE backend_xmin IS NOT NULL OR backend_xid IS NOT NULL ORDER BY greatest(age(backend_xmin), age(backend_xid)) DESC;
Abandoned prepared transactions
A transaction prepared for a two-phase commit will prevent cleanup until it is either committed or rolled back.
Solution
You can find the prepared transaction by running the following command:
SELECT gid, prepared, owner, database, transaction AS xmin FROM pg_prepared_xacts ORDER BY age(transaction) DESC;
Once identified, you can either commit or cancel the transaction by running either of commands:
COMMIT PREPARED <gid_from_above>; ROLLBACK PREPARED <gid_from_above>;
Couldn't find what you were looking for or want to talk about something specific?
Start a conversation with us →