I love emergency calls. They give me a chance to get into something new, very quickly, and get them fixed. A few days ago a company called SpaceInch made a new app called Say The Same Thing, which became an instant hit, possibly due to the quite awesome video by OK Go, which you should watch even if you don't look at anything else in this post.
Since the launch they have been hit by a huge flood of users and last night things started to drag, and they asked PostgreSQL Experts for some emergency help. The app is backed by a PostgreSQL database running on Heroku. During a phone call of two hours, I was able to get in, examine the database, and devise some steps that we hope will have the effect of making things run smoother.
The solution is going to involve some partitioning of one or two tables. This isn't being done so much for the benefit of constraint exclusion as because it lets them drop large amounts of data that is no longer needed, cheaply. Previously they had been cleaning one of these tables by deleting rows, but abandoned that as it adversely affected performance. But the consequence of that was that the table just grew and grew to several million rows in a few days. Dropping child tables that contain data that is of no further interest will work well to tame the data size without all the overhead of row deletion.
Troubles sometimes don't come singly. I actually had to deal with two emergencies last night, which is quite unusual - the other was also for a client running on Amazon. For good or ill, we are seeing more and more use of Postgres instances running in the cloud - whether managed by the client or on a managed service such as Heroku.
Not sure if you're doing time or serial based partitioning, but an extension I've been working on for managing it could possibly make things easier
ReplyDeletehttps://github.com/keithf4/pg_partman
It's time based. I'll bear it in mind.
ReplyDelete