Thursday, February 19, 2009

PostgreSQL C Function Debugging

As of late I have been becoming more and more interested in PostgreSQL(pg). I have a few high level reasons for this interest. One the economy is down and it seems of value to learn about a full featured 100% free database. You never know when a database like pg might fill in a business need that would of been filled with an expensive proprietary database when the economy is strong. It's also got a nice set of features and a really great/smart/geeky community to help and support you.

This post will only focus on creating and debugging a c function that you can create in pg.

To read up on what has already been typed up on the nitty gritty of pg check out these links.
General all encompassing ->http://www.postgresql.org/docs/8.3/interactive/index.html
Stored Functions in c ->http://www.postgresql.org/docs/8.3/interactive/xfunc-c.html

I run my pg server on old desktop pc hardware. The CPU is 800MHZ and there is one gig of ram. The box is running FreeBSD 7.0 32-bit. Pg is version 8.3.5.

Here is my super simple and almost pointless C function that I used to grasp the basics creating and debugging a c function in pg.
/* foo.c*/
#include "postgres.h"
#include
#include "fmgr.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(c_loop);

Datum
c_loop(PG_FUNCTION_ARGS)
{
int32 arg = PG_GETARG_INT32(0);
while (arg >= 0)
{
arg--;
}

PG_RETURN_INT32(arg);
}

To create a c function and compile it follow the steps I have outlined bellow.

Prerequisites for how I did this:
  1. Pg server compiled with --enable-debug (NOT 100% sure if this is needed yet need to double check)
  2. The pg server up and running and you are able to connect to it.
  3. Make sure you have the pg source code on the same machine you are compiling on. To compile your c function you need some of the pg source code. I'm doing all my compiles on FreeBSD 7.0 so if you are using another OS look here http://www.postgresql.org/docs/8.3/interactive/xfunc-c.html in section titled 34.9.6. Compiling and Linking Dynamically-Loaded Functions.
  4. I'm logged in as the postgres UNIX account when creating my file and compiling. I'm also logged in as the postgres account when connecting to the database via psql command line tool.
  5. You have a c compiler and gdb installed.
Steps to compile and debug:
  1. Copy the code above into file foo.c and save it.
  2. Compile foo.c with debug symbols by adding the -g option. For example on FreeBSD you do this gcc -g -fpic -c foo.c -I/usr/home/postgres/src/include and then gcc -g -shared -o foo.so foo.o. You shouldn't get any errors if all goes well.
  3. Make sure your foo.so is in a location that the pg server can see. If postgres is the account running the server which most likely it will be make sure you put the file in a location that postgres has access to. I put mine in a directory that postgres owns full rights to.
  4. Execute the following from psql.
    CREATE FUNCTION public.c_loop(integer) RETURNS integer
    AS '/usr/home/postgres/foo', 'c_loop' /*Change to point to your foo file location*/
    LANGUAGE C STRICT;
    This creates a stored function in the database that we can call via SQL to invoke our C function.
  5. To make sure all is well with our new database function execute select public.c_loop(10); from psql. It should execute to completion and return -1.
  6. Now onto the debugging. Log into psql and LOAD the C function like this -> load '/usr/home/postgres/foo'; This will cause the pg server to load the object so we can debug it.
  7. Find the back end process that you are connected to in psql by running select pg_backend_pid(); while in the same psql session.
  8. Open another terminal session and run gdb for the pid returned from step 7 like this -> gdb /usr/local/pgsql/bin/postgres 27069
  9. Now from the gdb session you started in step 8 set a break point in the c function by typing -> b c_loop
  10. Now type continue or cont
  11. Go back to your psql session and execute the function via SQL like this -> SELECT public.c_loop(10);
  12. Go back to your gdb session and check that you hit the break point. You can now step through the C function in gdb. Type s and hit enter and you should be stepping through the code.
You now have a high performance stored function language that you can depend on for computational intensive needs and the added bonus of being able to debug the code with a debugger. Not bad for all of this being free. Feel free to ask questions and I can try and answer them.