In this post I will explain how VACUUM
might obtain an AccessExclusiveLock
on a table with a very specific layout due to truncation maintenance operations. Combining that behavior with the unexpected nature of AutoVacuum can create undesirable scenarios at the worst times.
AutoVacuum
PostgreSQL autovacuum is a background process which performs maintenance operations on database tables. You can think of it as a job that starts running based on some heuristic (which can be configured) and executes the VACUUM
command on a table.
My understanding was that VACUUM
was safe to run online, that it wouldn’t cause any outage in production as it doesn’t acquire an AccessExclusiveLock
. I frequently check this website to learn which locks are acquired by each PostgreSQL command. And it shows that VACUUM
only obtains an ShareUpdateExclusiveLock
(which it does under normal circumstances).
But there is one specific case where VACUUM
(and consequently the autovacuum process too) will acquire an AccessExclusiveLock
:
The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained. docs ref
What does that mean?
If you are deleting old records from a table, leaving empty pages (pages with only dead tuples) at the end of your table (physical position in disk), VACUUM
will obtain an AccessExclusiveLock
on that table, if it can obtain it immediately. It will then truncate those empty pages to free up some disk space.
This is exactly what happened to me as we were running a job that was migrating old records to cold storage and deleting them afterward from the table. This generated the perfect scenario for AutoVacuum to lock the table for a while and cause a production outage.
Solutions
Disabling truncation
It is possible to configure the VACUUM
command to disable the truncation behavior by defining the following setting on a per-table basis:
ALTER TABLE
my_table
SET
vacuum_truncate=false,
toast.vacuum_truncate=false;
If you are running VACUUM
manually instead of AutoVacuum, you can parameterize the command itself:
VACUUM (TRUNCATE FALSE) my_table;
Although you will want to free up that disk space at some point. If you are still afraid of AccessExclusiveLock
you can execute an online (concurrent) table rewrite with pg_repack.
Use TRUNCATE
Another solution is to execute the TRUNCATE
command (instead of DELETE
) when removing the rows you are migrating. This will acquire an AccessExclusiveLock
and free up disk space.
I haven’t tried it yet, but it should be very fast to execute, if the truncation is executed in controlled batches of rows, the AccessExclusiveLock
shouldn't be as big of a problem, and we can save ourselves from an outage. In my case VACUUM
started to truncate millions of rows at once, which seems to have locked the table long enough to trigger our on-call monitors.
Community notes
It seems like this truncation shouldn't be a big problem, and some people have pointed out that this post might be exaggerated or too alarmist. However, there have been other responses explaining that when configured replicas are in place, they might exacerbate the issue. This is because they will have to replicate the truncation by blocking on their side and applying the changes made on the main database by autovacuum.
Except if you have streaming replicas. The problem then is that replicas MUST replay the truncation, so either replication blocks or the replica needs to kill any blocking queries. Hot Standby Feedback does NOT do anything to alleviate this either.
When I encountered this years ago it was exacerbated by the use of read replicas. The vacuum process has no way of knowing that it's blocking a query on the read replica.
Once you know about this particularity of VACUUM
, it is trivial to avoid it in the future. Interestingly enough, I had never heard about this specific behavior before, and my understanding was that VACUUM
was "safe to run but doesn’t recover disk space" and VACUUM FULL
was the heavy, dangerous, full-locking, version of it.
Turns out VACUUM
can be dangerous too.
DISCUSS ON REDDIT