Wednesday, December 31, 2014
The other day I had an idea for getting around a problem I recently hit with pg_repack: it collides with a lock request made on the table while it is copying data but before it starts rebuilding indexes. After considerable time in the debugger, I found the cause. pg_repack fetches each index definition only at the point where it builds the index, and pg_get_indexdef() requires an ACCESS SHARE lock on the table. The lock request is blocked waiting for pg_repack, pg_repack is blocked waiting for the lock requester, and there they stay.

My first attempt at a fix was to move the index definition fetch to just after the point where the table definition is fetched. That ran into trouble with existing strong locks, because it happens before the brief window near the start of pg_repack's run in which it obtains a strong lock on the table and cancels anything else holding such a lock. So I moved the fetch inside the section where pg_repack already holds that strong lock. With that change, it works both when there is an existing strong lock on the table and when a strong lock is requested during the data copy phase. I've sent a pull request for this fix to the repo.
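
For anyone who wants to picture the mutual wait described above, it can be modeled as a cycle in a wait-for graph. This is only a minimal Python sketch of my reading of the situation: the names `find_cycle`, `before_fix`, and `after_fix` and the session labels are made up for illustration and are not part of pg_repack or PostgreSQL.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from `start`.

    `waits_for` maps a waiter to the single party it is blocked on.
    Returns the cycle as a list (first node repeated at the end),
    or None if the chain ends at someone who is not waiting.
    """
    path = []
    seen = set()
    node = start
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # chain ended: nobody here waits forever
    return path[path.index(node):] + [node]


# Before the fix (hypothetical labels): the lock requester waits for
# pg_repack, and pg_repack's late pg_get_indexdef() call waits behind
# the lock requester.
before_fix = {
    "lock requester": "pg_repack",
    "pg_repack": "lock requester",
}

# After the fix: the index definitions were already fetched while
# pg_repack held its strong lock, so pg_repack waits on nobody.
after_fix = {
    "lock requester": "pg_repack",
}

cycle_before = find_cycle(before_fix, "pg_repack")
cycle_after = find_cycle(after_fix, "lock requester")
```

In this toy model the first graph contains a cycle and the second does not, which is the whole point of fetching the index definitions early: the lock requester still has to wait, but pg_repack never waits on it in return.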