I wondered why they were doing it that way, and if it might be easier to read if it was just:SELECT a,b,c INTO foo.x, foo,y, foo.z;
Now, these aren't quite the same, especially if foo has more than three fields. But even that could be got around.foo := (a,b,c);
But before I tried this out I decided to see how they performed. Here's what happened:
andrew=# do $x$ declare r abc; begin for i in 1 .. 10000000 loop select 'a','b',i into r.x,r.y,r.z; end loop; end; $x$; DO Time: 63731.434 ms
That's a very big difference! Direct assignment takes less than 30% of the time that SELECT INTO takes.andrew=# do $x$ declare r abc; begin for i in 1 .. 10000000 loop r := ('a','b',i); end loop; end; $x$; DO Time: 18744.151 ms
I'm going to dig into why this happens, but meanwhile, I have quite a lot of low hanging performance fruit to pick as a result of this.
I check you reproducing the same relative speeds (approx. 3x), though my laptop seems to be twice as fast as yours :-) I was at first suspicious of the test constants, since in SELECT context those tend to stay 'unknown' as long as possible, I wasn't sure if they're resolv3ed earlier in the pgsql assignment. However, recasting the 10 million loop as
ReplyDeletethree nested loops i,j,k of 1000, 100, and 100, and making the assignment then k,j,i yields exactly the same result.
You're context-switching from PL/PgSQL->SQL>PL/PgSQL every time you do a SELECT INTO, so you're paying for query overhead of the parse once, and the switches on every iteration, and not paying for either with direct assignment. Doing SELECT INTO to fetch data that isn't actually in the database is a waste.
ReplyDelete