Tuesday, April 1, 2014

Almost a crosstab using JSON

Most of the recent focus on the new json facilities has been on jsonb. This is understandable - it shows great promise, and a lot of people put in a lot of work on it. Besides myself, there were Teodor Sigaev, Oleg Bartunov, Alexander Korotkov and Peter Geoghegan as principal authors, so it was quite a large effort.

But there are some other new facilities for JSON that shouldn't be forgotten. I was reminded today of some of the things we could do. In particular, we can generate json on all sorts of interesting and useful ways.

A client was finding the crosstab functionality of the tablefunc module unwieldy. The trouble is you need to know in advance what the categories are. But we can construct something with the JSON functions that doesn't need any such knowledge. It's not quite a crosstab, but it will give you very similar functionality. Instead of rolling up the values into SQL fields named after the categories, we roll them up as values in a JSON object where the fields are named for the categories.

It's fairly simple.

Consider the following table:
andrew=# select * from xx;
 a | b | c  
---+---+----
 1 | b | x
 1 | c | y
 1 | d | z
 2 | d | dz
 2 | c | cz
 2 | b | bz
 2 | e | ez
(7 rows)
To get a crosstab we do something like this:
andrew=# select * 
from crosstab('select * from xx') a(r int, b text, c text, d text, e text);
 r | b  | c  | d  | e  
---+----+----+----+----
 1 | x  | y  | z  | 
 2 | dz | cz | bz | ez
(2 rows)
To get the JSON object, instead we do this:
andrew=# select a as r, json_object(cols, vals) as pivot 
from (select a, array_agg(b order by b) as cols, array_agg(c order by b) as vals 
      from xx group by a) x;
 r |                      pivot                       
---+--------------------------------------------------
 1 | {"b" : "x", "c" : "y", "d" : "z"}
 2 | {"b" : "bz", "c" : "cz", "d" : "dz", "e" : "ez"}
(2 rows)
Note how we didn't need to supply a column list reflecting the category values. The new json_object() function takes one or two arrays and outputs the corresponding JSON object.

It's also possible to do this with hstore today:

andrew=# select a as r, hstore(cols, vals) as pivot 
from (select a, array_agg(b order by b) as cols, array_agg(c order by b) as vals 
      from xx group by a) x;
 r |                   pivot                    
---+--------------------------------------------
 1 | "b"=>"x", "c"=>"y", "d"=>"z"
 2 | "b"=>"bz", "c"=>"cz", "d"=>"dz", "e"=>"ez"
(2 rows)

3 comments:

  1. Thanks, this is very useful.

    The first example has 'a' where you mean 'as', it should be

    select * from crosstab('select * from xx') AS ....

    Look forward to your talk at PGCon.

    ReplyDelete