*** INSTALLING ***

0) Build, install or borrow postgresql 7.1, not 7.0. I've got a
language module for 7.0, but it has no SPI interface. Build is best
because it will allow you to do

  "cd postgres/src/"
  "patch -p2 < dynloader.diff"

or, if that fails, open linux.h in src/backend/ports/dynloader and
change the pg_dlopen define from

  #define pg_dlopen(f) dlopen(f, 2)

to

  #define pg_dlopen(f) dlopen(f, (RTLD_NOW|RTLD_GLOBAL))

Adding the RTLD_GLOBAL flag to the dlopen call allows libpython to
properly resolve symbols when it loads a dynamic module. If you can't
patch and rebuild postgres, read about DLHACK in the next section.

1) Edit the Makefile. Basically, select python 2.0 or 1.5 and set
the include file locations for postgresql and python. If you can't
patch linux.h (or whatever file is appropriate for your architecture)
to add RTLD_GLOBAL to the pg_dlopen/dlopen call and rebuild
postgres, you must uncomment the DLHACK and DLDIR variables. You may
need to alter DLDIR and add shared modules to DLHACK. This
explicitly links the shared modules into the plpython.so file, which
allows libpython to find the required symbols. However, you will NOT
be able to import any C modules that are not explicitly linked into
plpython.so. Module dependencies get ugly, and all in all it's a
crude hack.

2) Run make.

3) Copy 'plpython.so' to '/usr/local/lib/postgresql/lang/'.
The scripts 'update.sh' and 'plpython_create.sql' are hard coded to
look for it there; if you want to install the module elsewhere, edit
them.

4) Optionally run 'test.sh'; this will create a new database
'pltest' and run some checks. (More checks are needed.)

5) 'psql -Upostgres yourTESTdb < plpython_create.sql'
*** USING ***

There are sample functions in 'plpython_function.sql'.
Remember that the python code you write gets transformed into a
function, i.e.

  CREATE FUNCTION myfunc(text) RETURNS text
    AS
  'return args[0]'
    LANGUAGE 'plpython';

gets transformed into

  def __plpython_procedure_myfunc_23456():
    return args[0]

where 23456 is the Oid of the function.

If you don't provide a return value, python returns the default 'None',
which probably isn't what you want. The language module transforms
python None to postgresql NULL.

Postgresql function arguments are available in the global "args" list.
In the myfunc example, args[0] contains whatever was passed in as the
text argument. For myfunc2(text, int4), args[0] would contain the
text variable and args[1] the int4 variable. The global dictionary SD
is available to store data between function calls; this variable is
private static data. The global dictionary GD is public data,
available to all python functions within a backend. Use with care.
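
The per-function behaviour of SD can be sketched as below. This is a
hedged illustration only: in the backend the handler supplies SD to each
function, while here a plain dict stands in for it, and the 'body'
function and the "greeting" value are made up for the example.

```python
# SD persists between calls of the same function within one backend.
# A plain dict stands in for the handler-supplied SD; 'body' plays the
# role of the python function body.
SD = {}

def body(args):
    # the first call populates the cache; later calls reuse the value
    if "greeting" not in SD:
        SD["greeting"] = "hello, "
    return SD["greeting"] + args[0]

body(["world"])   # stores "hello, " in SD and returns "hello, world"
body(["again"])   # reuses the cached value, returns "hello, again"
```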

When the function is used in a trigger, the trigger's tuples are in
TD["new"] and/or TD["old"], depending on the trigger event. Return
'None' or "OK" from the python function to indicate the tuple is
unmodified, "SKIP" to abort the event, or "MODIFIED" to indicate
you've modified the tuple. If the trigger was called with arguments,
they are available in TD["args"][0] to TD["args"][n - 1].
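
A sketch of that trigger protocol, with the handler-supplied TD stubbed
out; the event, column names and values here are invented purely for
illustration.

```python
# TD is supplied by the language handler inside a real trigger; this
# stub only shows the shape described above.
TD = {"event": "INSERT",
      "new": {"first_name": "ada", "last_name": "lovelace"}}

def trigger_body():
    # upper-case the incoming last_name and report the tuple as changed
    if TD["new"]["last_name"] != TD["new"]["last_name"].upper():
        TD["new"]["last_name"] = TD["new"]["last_name"].upper()
        return "MODIFIED"
    return "OK"   # tuple left unmodified

trigger_body()    # first call rewrites TD["new"], returns "MODIFIED"
trigger_body()    # nothing left to change, returns "OK"
```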

Each function gets its own restricted execution object in the python
interpreter, so global data and function arguments from myfunc are not
available to myfunc2, except for data in the GD dictionary, as
mentioned above.

The plpython language module automatically imports a python module
called 'plpy'. The functions and constants in this module are
available to you in the python code as 'plpy.foo'. At present 'plpy'
implements the functions 'plpy.error("msg")', 'plpy.fatal("msg")',
'plpy.debug("msg")' and 'plpy.notice("msg")'. They are mostly
equivalent to calling 'elog(LEVEL, "msg")', where LEVEL is DEBUG,
ERROR, FATAL or NOTICE. 'plpy.error' and 'plpy.fatal' actually raise
a python exception which, if uncaught, causes the plpython module to
call elog(ERROR, msg) when the function handler returns from the
python interpreter. Long jumping out of the python interpreter
probably isn't good. 'raise plpy.ERROR("msg")' and 'raise
plpy.FATAL("msg")' are equivalent to calling plpy.error or plpy.fatal.
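
The exception behaviour described above can be sketched like this. The
real 'plpy' is injected by the language handler; the tiny stand-in
below only mimics the documented surface, and 'checked_divide' is an
invented example function.

```python
# Stand-in for the server-provided 'plpy' module: plpy.error raises a
# python exception that you can catch, or let propagate so the handler
# calls elog(ERROR, msg) on return.
class _Error(Exception):
    pass

class plpy:
    ERROR = _Error                  # so 'raise plpy.ERROR("msg")' works

    @staticmethod
    def error(msg):
        raise plpy.ERROR(msg)       # same effect as raising plpy.ERROR

    @staticmethod
    def notice(msg):
        return "NOTICE: " + msg     # the real module calls elog(NOTICE, msg)

def checked_divide(a, b):
    if b == 0:
        plpy.error("division by zero")
    return a / b
```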

Additionally, the plpy module provides two functions called
execute and prepare. Calling plpy.execute with a query string and an
optional limit argument causes that query to be run, with the result
returned in a result object. The result object emulates a list or
dictionary object, and can be accessed by row number
and field name. It has these additional methods: nrows(), which
returns the number of rows returned by the query, and status, which is
the SPI_exec return variable. The result object can be modified.

  rv = plpy.execute("SELECT * FROM my_table", 5)

returns up to 5 rows from my_table. If my_table has a column my_field,
it would be accessed as

  foo = rv[i]["my_field"]

The second function, plpy.prepare, is called with a query string and a
list of argument types if you have bind variables in the query.

  plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])

"text" is the type of the variable you will be passing as $1. After
preparing, you use the function plpy.execute to run it.

  rv = plpy.execute(plan, [ "name" ], 5)

The limit argument is optional in the call to plpy.execute.

When you prepare a plan using the plpython module it is automatically
saved. Read the SPI documentation for postgresql for a description of
what this means. Anyway, the take home message is that if you do:

  plan = plpy.prepare("SOME QUERY")
  plan = plpy.prepare("SOME OTHER QUERY")

you are leaking memory, as I know of no way to free a saved plan. The
alternative of using unsaved plans is even more painful (for me).
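
One idiom that works around the leak: prepare each query once per
backend and cache the plan in SD, so repeated calls reuse it instead of
creating a new saved plan each time. A minimal sketch, with 'plpy' and
'SD' (both handler-supplied in a real function) stubbed out, and the
query, table and column names invented:

```python
# Stubs standing in for the handler-provided objects, instrumented to
# count how many plans get prepared.
class plpy:
    prepared = 0

    @staticmethod
    def prepare(query, argtypes):
        plpy.prepared += 1                   # count plans for the demo
        return ("plan", query, tuple(argtypes))

    @staticmethod
    def execute(plan, args):
        return [{"last_name": "smith"}]      # stand-in result row

SD = {}

def body(args):
    # prepare on first call only; later calls reuse the cached plan
    if "user_plan" not in SD:
        SD["user_plan"] = plpy.prepare(
            "SELECT last_name FROM my_users WHERE first_name = $1", ["text"])
    return plpy.execute(SD["user_plan"], args)[0]["last_name"]
```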
*** BUGS ***

If the module blows up postgresql or bites your dog, please send a
script that will recreate the behaviour. Back traces from core dumps
are good, but python reference counting bugs and postgresql exception
handling bugs give uninformative back traces (you can't longjmp into
functions that have already returned? *boggle*)
*** TODO ***

1) create a new restricted execution class that will allow me to pass
function arguments in as locals; passing them as globals means a
function cannot be called recursively...

2) Functions cache the input and output functions for their arguments,
so the following will make postgres unhappy:

  create table users (first_name text, last_name text);
  create function user_name(user) returns text as 'mycode' language 'plpython';
  select user_name(user) from users;
  alter table users add column user_id int4;
  select user_name(user) from users;

You have to either drop and recreate the function(s) each time their
arguments are modified (not nice), not cache the input and output
functions (slower?), or check whether the structure of the argument has
been altered (is this possible, easy, quick?) and recreate the cache.

3) better documentation

4) suggestions?