
Write a Stored Procedure in Postgres 8+

Stored Procs

As developers, we’re sometimes tasked with data-intensive work: importing data into a database, cleaning up sets of incomplete records, or transferring data from one table to another through some kind of filter. While our application would normally be in charge of creating and maintaining the data, we don’t always want to write an entire module or mini application to handle these tasks. Since they’re data intensive, a stored procedure might be a good approach to take. Stored procedures are programs written in a procedural extension of SQL (Structured Query Language) that manipulate data records directly within the database environment.

If we were to write the equivalent code in a layer written in Java, .NET, or PHP, there would be a lot of overhead in processing power and performance, potentially orders of magnitude more. As data is processed, results would normally be returned to the calling layer and shuffled around that layer’s memory, essentially adding another step to the process. If we make these changes as close to the data as possible, we squeeze out as much performance as we can and suffer the least overhead. Just for perspective, here’s an example: a 1 gigabyte file could take several hours to import using Java business logic, while a stored proc could take less than half an hour. Mileage may vary of course, but that gives you an idea of the performance cost you could save on data-intensive tasks like that. A word of caution though: I’m not saying a stored proc is the way to go for your entire application; it’s merely one tool in your arsenal for getting the job done as efficiently as possible.


Here’s an example of a generic stored proc written in PL/pgSQL (Postgres’s procedural language).

CREATE OR REPLACE FUNCTION example_stored_proc() RETURNS void AS $$
DECLARE
     userRecord record;
     user_property_id bigint;
BEGIN
     -- assumes tb_user has the columns id, email and active
     FOR userRecord IN
          SELECT * FROM tb_user u ORDER BY u.id
     LOOP
          SELECT INTO user_property_id nextval('sq_user_property');

          -- user_property_id now has a value we can insert here
          -- (assumes tb_user_property holds the new id followed by the user's id)
          INSERT INTO tb_user_property VALUES(
                    user_property_id,
                    userRecord.id
          );

          IF userRecord.email LIKE 'user@domain.com' THEN

                    UPDATE tb_user SET email = 'user@other-domain.com' WHERE id = userRecord.id;

          ELSEIF userRecord.email IS NULL THEN

                    UPDATE tb_user SET active = false WHERE id = userRecord.id;

          ELSE

                    -- a single quote inside a string literal is written twice
                    RAISE NOTICE 'didn''t update any record';

          END IF;

          RAISE NOTICE 'added property for user id: %', userRecord.id;
     END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION example_stored_proc() RETURNS void AS $$
CREATE OR REPLACE FUNCTION creates the stored proc in the database, replacing any existing definition with the same name and arguments. RETURNS declares the data type returned at the end. This example returns nothing (void), but an integer, a record, or a result set may also be returned. The text between the two $$ markers is the body of the procedure.
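
To make the RETURNS clause concrete, here is a minimal sketch of a function that returns an integer, followed by the call that runs it. The name count_to and its parameter are hypothetical, made up only for this illustration.

```sql
-- Hypothetical helper, not part of the example above.
CREATE OR REPLACE FUNCTION count_to(upper_bound integer) RETURNS integer AS $$
DECLARE
     total integer := 0;
BEGIN
     -- add up 1, 2, ... upper_bound
     FOR i IN 1..upper_bound LOOP
          total := total + i;
     END LOOP;
     RETURN total;  -- the returned value must match the type named in RETURNS
END;
$$ LANGUAGE plpgsql;

-- A function is executed by selecting it.
SELECT count_to(4);
```

A function declared with RETURNS void is called the same way; it just hands nothing back to the caller.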

The DECLARE keyword introduces the variables the stored proc will be using. It essentially lets the database know to allocate memory for them.

The BEGIN keyword marks the beginning of the stored proc logic. It naturally ends with END.
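
The declaration section and the BEGIN ... END block fit together roughly as in the sketch below. The function declare_demo and its variable names are hypothetical; the point is only to show where declarations go and how they can be given starting values.

```sql
-- Hypothetical example: shows the DECLARE section followed by BEGIN ... END.
CREATE OR REPLACE FUNCTION declare_demo() RETURNS text AS $$
DECLARE
     greeting text := 'hello';          -- declared with an initial value
     repeat_count integer DEFAULT 2;    -- DEFAULT is an alternative to :=
     result text;                       -- no initial value, so it starts as NULL
BEGIN
     result := repeat(greeting, repeat_count);
     RETURN result;
END;
$$ LANGUAGE plpgsql;

SELECT declare_demo();
```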

FOR userRecord IN
     SELECT * FROM tb_user u ORDER BY u.id
LOOP
     -- do stuff
END LOOP;


This is the basic looping structure used in PL/pgSQL. Notice the loop is built around a straightforward SQL query; this is where the magic happens. The looping variable in this example is “userRecord”: it holds the currently fetched data record and lets you manipulate it for your own means in the body of the loop. So, if you wanted to insert the value of userRecord.id into a table, you could use it like any other variable in an insert statement inside the loop’s body.
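
If you want to try the looping structure without any tables in place, a sketch like the following should do. It swaps the tb_user query for generate_series so it is self-contained; loop_demo, rowRecord and the score column are hypothetical names.

```sql
-- Hypothetical example: loops over a query result and reads fields off the record.
CREATE OR REPLACE FUNCTION loop_demo() RETURNS integer AS $$
DECLARE
     rowRecord record;
     running_total integer := 0;
BEGIN
     FOR rowRecord IN
          -- stands in for a real table: three rows with id 1..3 and score 10..30
          SELECT g.n AS id, g.n * 10 AS score
          FROM generate_series(1, 3) AS g(n)
          ORDER BY g.n
     LOOP
          -- each pass sees one row; its columns are reached as rowRecord.column
          running_total := running_total + rowRecord.score;
     END LOOP;
     RETURN running_total;
END;
$$ LANGUAGE plpgsql;

SELECT loop_demo();
```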


The SELECT INTO construct, as in SELECT INTO user_property_id nextval('sq_user_property'), stores a query result in a variable for later use. Inside PL/pgSQL it fills a variable rather than creating a table, which is what SELECT INTO does in plain SQL. Your variable can be a record or a single column value. In order for it to work you need to declare the variable that’s going to take the value in the DECLARE section of the stored proc. Inline variable declaration is not supported.
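
Here is a small sketch of both uses of SELECT INTO, one with a single column value and one with a whole record. It reads from generate_series and literal values so it runs on its own; select_into_demo and the variable names are hypothetical.

```sql
-- Hypothetical example: SELECT INTO with a scalar variable and with a record.
CREATE OR REPLACE FUNCTION select_into_demo() RETURNS text AS $$
DECLARE
     max_value integer;   -- receives a single column value
     pair record;         -- receives a whole row
BEGIN
     SELECT INTO max_value max(g.n) FROM generate_series(1, 5) AS g(n);
     SELECT INTO pair 7 AS id, 'seven'::text AS label;
     RETURN max_value::text || ':' || pair.label;
END;
$$ LANGUAGE plpgsql;

SELECT select_into_demo();
```

Both variables are listed under DECLARE before they are used, which is the requirement described above.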


As expected, the IF/THEN/ELSEIF/ELSE/END IF construct can be used to create conditional sequences of logic. Each condition can be any boolean expression Postgres can evaluate. ELSEIF (also spelled ELSIF) wraps secondary conditions, while ELSE is of course the default branch if no other condition is met. Fairly self-explanatory.
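
As a quick sketch of the branching, the function below checks an email value the same way the example procedure does, but returns a label instead of updating anything. The name classify_email and the labels it returns are hypothetical.

```sql
-- Hypothetical example: one branch per condition, with ELSE as the fallback.
CREATE OR REPLACE FUNCTION classify_email(email text) RETURNS text AS $$
BEGIN
     IF email IS NULL THEN
          RETURN 'missing';
     ELSIF email LIKE '%@domain.com' THEN
          RETURN 'old domain';
     ELSE
          RETURN 'unchanged';
     END IF;
END;
$$ LANGUAGE plpgsql;

SELECT classify_email('user@domain.com');
```

The NULL check comes first because a comparison such as LIKE against a NULL value is neither true nor false, so that row would otherwise fall through to ELSE.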


RAISE NOTICE is your standard PL/pgSQL logging output statement. The text in the single quotes is output to the console/message window, and every “%” is replaced, in order, by the values listed after the text, separated by commas. So, in this case “userRecord.id” is substituted into the first % to appear in the output text. If you wanted to have multiple values output you could construct your RAISE NOTICE like this:

RAISE NOTICE 'this is record % out of 1000, and its value is %', record_number, record_value; 

It would substitute record_number into the first % and record_value into the second % appearing in the text.
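
To see the substitution in action, a sketch along these lines can be run as-is. The function notice_demo and the values it prints are hypothetical; only the RAISE NOTICE line mirrors the statement above.

```sql
-- Hypothetical example: emits one NOTICE per loop pass with two substituted values.
CREATE OR REPLACE FUNCTION notice_demo() RETURNS void AS $$
DECLARE
     record_value text;
BEGIN
     FOR record_number IN 1..3 LOOP
          record_value := 'value-' || record_number::text;
          -- first % takes record_number, second % takes record_value
          RAISE NOTICE 'this is record % out of 3, and its value is %', record_number, record_value;
     END LOOP;
     -- a doubled %% prints a literal percent sign
     RAISE NOTICE 'done: 100%% of records processed';
END;
$$ LANGUAGE plpgsql;

SELECT notice_demo();
```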