Friday, December 19, 2014

Waiting for extra statement stats

One of the things I was hoping would make it into PostgreSQL 9.4 and didn't was some extra statistics in pg_stat_statements. Specifically, it was proposed to add minimum, maximum and (after I proposed it and pressed the case) the standard deviation for execution times. Keeping these stats is very cheap, so there should be close to zero performance impact. They will give you a much better idea of the spread of execution times than a simple average, which might be significantly dominated by outlier times. Unfortunately, these got left out for 9.4, which is rather disappointing. I'm going to see if I can revive the proposal for 9.5.

5 comments:

  1. I tend to think that making pg_stat_statements use atomic fetch-and-add will make this process a lot easier.

    ReplyDelete
    Replies
    1. I, too would like to know what it is. One of my main worries about the previous patch is that it uses a naive method of calculating the standard deviation, rather than Welford's method. See http://www.johndcook.com/blog/standard_deviation/ for details.

      Delete
  2. Seems like a good thing to have. Reading this I started Googling "pg_stat_statements vs pgBadger" (since we use the latter, but now I can't remember if pgBadger does report minimum, maximum and standard deviation execution times). I stumbled across this site http://bonesmoses.org/tag/pg_stat_statements/ where the author says "The only real answer to this question is: do not use pgBadger. Before anyone asks—no, you shouldn’t use pgFouine either. This is not an indictment on the quality of either project, but a statement of their obsolescence in the face of recent PostgreSQL features." yet he doesn't back up this claim to explain why I should be using built-in features when pgBadger does the job for us.

    ReplyDelete
  3. Let's keep to the topic, please. IIRC Greg Smith and Peter Geoghegan gave a talk about why pg_stat_statements is better than pgbadger / pgfouine.

    ReplyDelete