Thursday, April 18, 2013

unnest_ordinality extension

The other day Josh Berkus was working on one of our clients and needed a fast way to unnest some arrays with ordinality, that is to have a row number attached to each element. The usual way that is recommended for doing this is to use subscripting with generate_series(), or calling unnest() in a plpgsql function and then returning the elements one by one. Both of these turned out to be not fast enough, and he was resorting to some fairly ugly code to get around the speed issue. I offered him a simple C function that would do this very quickly, and we're releasing it publicly today. In use on our client's site it has been extremely fast and simple to use.

example:
andrew# create extension unnest_ordinality;

    andrew=# select * from unnest_ordinality('{a,b,c,d,e,f,g}'::text[]);
     element_number | element 
    ----------------+---------
                  1 | a
                  2 | b
                  3 | c
                  4 | d
                  5 | e
                  6 | f
                  7 | g
The package is available on The PostgreSQL Experts Inc Github Repository and also on PGXN

15 comments:

  1. Surely something I would love to use. Thanks Andrew!

    I typically achieve that now by:

    select row_number() over(),* from unnest('{a,b,c,d,e,f,g}'::text[]);

    ReplyDelete
  2. We tried that. This is LOTS faster.

    ReplyDelete
  3. Andrew,

    Had trouble compiling on my 9.2 windows mingw64. I had to take out the line

    #include "access/htup_details.h"

    Once I took that out seemed to work fine (compiled and quick test even on my VC 9.2 EDB install).

    With the line I was getting errors like:

    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup_details.h:130:
    3: error: conflicting types for 'DatumTupleFields'
    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup.h:131:3: note:
    previous declaration of 'DatumTupleFields' was here
    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup_details.h:132:
    8: error: redefinition of 'struct HeapTupleHeaderData'
    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup.h:133:16: note
    : originally defined here
    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup_details.h:459:
    8: error: redefinition of 'struct MinimalTupleData'
    c:/ming64/projects/pgx64/pg92/include/POSTGR~1/server/access/htup.h:461:16: note
    : originally defined here

    Not sure if others ran into same issue or just a mingw64 one.
    Thanks,
    Regina

    ReplyDelete
  4. I've pushed a fix - I had the version string comparison wrong.

    But something looks rather wrong with your installation, too. access/htup_details.h shouldn't even exist on a 9.2 installation.

    ReplyDelete
    Replies
    1. I thought it always did, but they shuffled functions around. In 9.3 it broke our postgis raster functionality the shuffling so had to make changes just for IFDEF for 9.3

      Delete
    2. I have opted to use separate branches in things like FDWs, that match the PostgreSQL branches. That's probably not possible for things like PostGIS. But you really shouldn't be building against unclean Postgres sources, which is what this looks like.

      Delete
    3. Nevermind I see you are right. looking back at this ticket

      http://trac.osgeo.org/postgis/ticket/2013

      I wonder if maybe I accidentally installed 9.3 in my 9.2 cluster once so it has a mix. I'll rebuild that.

      Delete
  5. Very useful! Maybe something which belongs in core?

    Just curious, what was the actual use-case? I know I've needed this before, but can't remember in what situation.

    ReplyDelete
    Replies
    1. I forget - you'd need to ask Josh :-)

      Delete
    2. There's a patch for full WITH ORDINALITY support floating around which will almost certainly be in 9.4.

      Delete
    3. I know. But lots of people can't wait that long. 9.4 is likely to be more than a year away and many people don't upgrade immediately. This was wanted in a hurry.

      Delete
    4. I've got a patch in for the next CommitFest which implements this in core. You can find it, including associated docs, here: https://commitfest.postgresql.org/action/patch_view?id=1074

      Let me know what you think :)

      david AT fetter DOT org

      Delete
    5. @Andrew:

      Yes, I was only explaining that it wouldn't make sense to add this into core because something better is in the works and we couldn't backpatch this function anyway. :-)

      But before 9.4 is out, I think this is a handy extension, so thank you for your work.

      Delete
  6. Teensy nit here: your version puts the ordinality column at the beginning, where the standard (and hence my patch) puts it at the end.

    ReplyDelete
    Replies
    1. I wasn't trying to implement the standard :-)

      Delete