09 September 2013

PostgreSQL® 9.3

PostgreSQL® 9.3 is here! The new major version brings many improvements and new functions. This article introduces some of the interesting innovations.

Improved Locking with Foreign Keys

Foreign keys are indispensable for the referential integrity of the data. However, up to and including PostgreSQL® 9.2, these also caused locking problems under certain conditions when UPDATE was performed on columns with foreign keys. Especially with overlapping updates of several transactions via a foreign key, deadlocks can even occur. The following example is problematic in PostgreSQL® 9.2, exemplified by this simple data model:

CREATE TABLE parent(id integer, value text);
ALTER TABLE parent ADD PRIMARY KEY(id);
CREATE TABLE child(id integer, parent_id integer REFERENCES parent(id), value text);
INSERT INTO parent VALUES(1, 'bob');
INSERT INTO parent VALUES(2, 'andrea');

 

Two transactions, called Session 1 and Session 2, are started simultaneously in the database:

Session 1

BEGIN;
INSERT INTO child VALUES(1, 1, 'abcdef');
UPDATE parent SET value = 'thomas' WHERE id = 1;

 

Session 2

BEGIN;
INSERT INTO child VALUES(2, 1, 'ghijkl');
UPDATE parent SET value = 'thomas' WHERE id = 1;

 

Session 1 and Session 2 both execute the INSERT successfully, but the UPDATE blocks in Session 1 because the INSERT in Session 2 holds a so-called SHARE LOCK on the tuple with the foreign key. This conflicts with the lock that the UPDATE wants to have on the foreign key. Now Session 2 tries to execute its UPDATE, but this in turn creates a conflict with the UPDATE from Session 1; A deadlock has occurred and is automatically resolved in PostgreSQL® 9.2. In this case, however, the transaction in Session 2 is aborted:

ERROR: deadlock detected
DETAIL: Process 88059 waits for ExclusiveLock on tuple (0,1) of relation 1144033 of database 1029038; blocked by process 88031.
Process 88031 waits for ShareLock on transaction 791327; blocked by process 88059.

 

In PostgreSQL® 9.3, there is now a significantly refined locking mechanism for updates on rows and tables with foreign keys, which alleviates the problem mentioned. Here, tuples that do not represent a targeted UPDATE to a foreign key (i.e. the value of a foreign key is not changed) are locked with a weaker lock (FOR NO KEY UPDATE). For the simple checking of a foreign key, PostgreSQL® also uses the new FOR KEY SHARE lock, which does not conflict with it. In the example shown, this prevents a deadlock; the UPDATE in Session 1 is executed first, and the conflicting UPDATE in Session 2 must wait until Session 1 successfully confirms or rolls back the transaction.

Parallel pg_dump

With the new command line option -j, pg_dump has the ability to back up tables and their data in parallel. This function can significantly speed up a dump of very large databases, as tables can be backed up simultaneously. This only works with the Directory output format (-Fd) of pg_dump. These dumps can then also be restored simultaneously with multiple restore processes using pg_restore.

Updatable Foreign Data Wrapper and postgres_fdw

With PostgreSQL® 9.3, the API for Foreign Data Wrapper (FDW) for DML commands has been extended. This enables the implementation of updatable FDW modules. At the same time, the Foreign Data Wrapper for PostgreSQL® (postgres_fdw) was integrated as an extension. This now allows transparent access to remote PostgreSQL® instances. Tables appear in the database as local tables and can easily be used, for example, in complex SQL constructs such as JOINs or Views. FDWs will be explained in more detail in a later article at this point.

Event Trigger

A feature that users have been requesting for some time are Event Trigger for DDL (Data Definition Language) operations, such as ALTER TABLE or CREATE TABLE. This enables automatic reactions to certain events via DDL, such as logging certain actions. Currently, actions for ddl_command_start, sql_drop and ddl_command_end are supported. Trigger functions can be created with C or PL/PgSQL. The following example for sql_drop, for example, prevents the deletion of tables:

CREATE OR REPLACE FUNCTION del_event_func() RETURNS event_trigger AS $$
DECLARE
 v_item record;
BEGIN
 FOR v_item IN SELECT * FROM pg_event_trigger_dropped_objects()
 LOOP
 RAISE EXCEPTION 'deletion of object %.% forbidden', v_item.schema_name, v_item.object_name;
 END LOOP;
END;
$$ LANGUAGE plpgsql;
 
CREATE EVENT TRIGGER delete_table_obj ON sql_drop WHEN tag IN ('drop table') EXECUTE PROCEDURE del_event_func();
 
DROP TABLE child ;
FEHLER: deletion of object public.child forbidden

Checksums in tables

With PostgreSQL® 9.3, it is now possible in environments in which the reliability of the storage solutions used cannot be guaranteed (for example, certain cloud environments) to initialize the database cluster with checksums. PostgreSQL® organizes tuples in blocks that are 8KB in size by default. These form the smallest unit with which the database works on the storage system. Checksums now provide these blocks with additional information in order to detect storage errors such as corrupt block headers or tuple headers at an early stage. Checksums cannot be activated subsequently, but require the initialization of the physical database directory with initdb and the new command line parameter –data-checksums. This then applies to all database objects such as tables or indexes and has an impact on the speed.

Materialized Views

The new PostgreSQL® version 9.3 now also contains a basic implementation for Materialized Views. Normal views in PostgreSQL® are not materialized objects in the sense of a table. In a figurative sense, views must be imagined as a kind of macro that PostgreSQL® applies to the underlying view. For example, a SELECT * FROM <view> is then rewritten to the actual, arbitrarily complex SELECT. However, this has the consequence that with large amounts of data, the result must be planned, executed and materialized again and again. With Materialized Views, views can be created that directly hold the materialized result set. However, the implementation in PostgreSQL® 9.3 is still very generic. The REFRESH MATERIALIZED VIEW command also blocks all access to the Materialized View.

These are just a few examples from the long list of innovations that have found their way into the new PostgreSQL® version. With JSON, improvements in streaming replication and some improvements in the configuration such as shared memory usage, many other new features have been added. For those who are interested in detailed explanations of all the new features, the Release Notes are highly recommended. RPM packages for RedHat, CentOS, Scientific Linux and Fedora are available at http://yum.postgresql.org. The PGAPT repository is available for Debian. Exact instructions can be found at http://wiki.postgresql.org/wiki/Apt

This article was originally written by Bernd Helmle.

Also worth reading: PostgreSQL Foreign Data Wrapper for Informix

Categories: PostgreSQL®
Tags: PostgreSQL®

cR

About the author

credativ Redaktion

about the person

Dieser Account dient als Sammelpunkt für die wertvollen Beiträge ehemaliger Mitarbeiter von credativ. Wir bedanken uns für ihre großartigen Inhalte, die das technische Wissen in unserem Blog über die Jahre hinweg bereichert haben. Ihre Artikel bleiben hier weiterhin für unsere Leser zugänglich.

View posts


Beitrag teilen: