Thursday, December 27, 2012

pg_stat_statements - a couple more stats would be nice

I just started looking at a production system with the pg_stat_statements module loaded. We started doing this after Peter Geoghegan's fantastic work on normalizing the statements made it much more useful to us, which was one motivating factor in migrating to PostgreSQL 9.2.

I was immediately struck with a wish for more stats. Specifically, I'd like stats on the spread of times for the calls - the minimum, maximum and standard deviation of the calls would be great things to have. Then we could separate queries that just generally run slowly from queries that mostly work well but have occasional spectacularly bad running times.


  1. Thanks for the kind words Andrew.

    I am of two minds on this. On the one hand, I'm inclined to think of pg_stat_statements as a low-level view that people should be constructing time-series data from. Tools like Munin already facilitate this sort of analysis, graphing data from "snapshot" views like pg_stat_bgwriter. It feels like we should do something similar here (though Munin certainly isn't up to it). On the other hand, I recognise that having things like the maximum time *would* be very useful, and the old basic approach of constructing time series data just isn't going to cut it, because its so easy to lose an outlier in the noise no matter how frequently those snapshots are taken.

    Incidentally, I am aware of one tool that aggregates information collected by pg_stat_statements, called datascope:

    I think that this is still just scratching the surface, though.

    I'm particularly proud of the performance characteristics of the new pg_stat_statements - unsympathetic benchmarks show a single digit percentage overhead, as fingerprinting is just a final additional stage of parsing, and as such takes advantage of temporal locality very well. It would be trivially simple to write a patch that does what you've suggested here, and perhaps you should. I'm a little bit paranoid about bloating pg_stat_statements though, so anyone who adds more stats to pg_stat_statements should be prepared to defend their additions against concerns about added overhead.

    Note that pg_stat_plans currently provides a method of viewing the stddev of average execution time of the set of plans executed for each query. Admittedly this is fairly grotty compared to pg_stat_statements (though not when compared to traditional log-analysis tools). The idea is that execution costs should be more directly blamed on the plan being executed rather than the query, in order to facilitate finding planner regression, but of course there could be huge variability in the execution time of a query/plan due to other factors, like, for example, lock contention, and so we could still miss notable variability.

    This stddev value is available from the pg_stat_plans_queries view.

  2. My approach was to add a 10 element counter array to pg_stat_statements that did counts indexed by log10 execution time. This gave me a histogram of counts that started with queries running faster than 10us and went all the way up to queries taking longer than 1000s. This way it's very easy to figure out if it's in fact running generally fast with a few very bad runtimes. In principle the buffer stats could also be split this way to see if the reason for long runtime is just due to processing more data or due to IO time.

    The overhead for tracking this was pretty much negligible in both space and time. If you are interested I can send you the WIP level patch I have lying around somewhere.

    1. That's probably not a bad approach. Did you think of submitting it as a patch?