05 August 2014

Informix Foreign Data Wrapper for PostgreSQL®

Since PostgreSQL® 9.1, PostgreSQL® has implemented an interface for accessing external data sources. The interface defined in the SQL Standard (SQL/MED) allows transparent access to and manipulation of external data sources in the same manner as with PostgreSQL®-native tables. External data sources appear as tables in the respective database and can be used without restriction in SQL statements. Access is implemented via a so-called Foreign Data Wrapper (FDW), which forms the interface between PostgreSQL® and the external data source. The FDW is also responsible for mapping data types or non-relational data sources to the table structure accordingly. This also enables the connection of non-relational data sources such as Hadoop, Redis, among others. An overview of some available Foreign Data Wrappers can be found in the PostgreSQL® Wiki.

Informix FDW

In the context of many Informix migrations, credativ GmbH developed a Foreign Data Wrapper (FDW) for Informix databases. This supports migration, as well as the integration of PostgreSQL® into existing Informix installations, to facilitate data exchange and processing. The Informix FDW supports all PostgreSQL® versions with SQL/MED implementation, i.e., from 9.1 up to the current 9.3. Furthermore, with PostgreSQL® 9.3, the Informix FDW also supports write operations on Foreign Tables.

Installation

The installation requires at least an existing IBM CSDK for Informix. The CSDK can be obtained directly via download. In the following, we assume that a CSDK installation exists in /opt/IBM/informix. The path may vary depending on the installation. The Informix FDW sources can be cloned directly from the credativ GitHub repository, or a release tarball can be acquired (version 0.2.1 at the time of this article). The following installation description assumes a system with CentOS 6, but can be performed with variations on any other Linux system.

% wget 'https://github.com/credativ/informix_fdw/archive/REL0_2_1.tar.gz'
% tar -xzf REL0_2_1.tar.gz

 

Subsequently, the FDW can be built by specifying the CSDK installation folder:

% export INFORMIXDIR=/opt/IBM/informix
% export PATH=$INFORMIXDIR/bin:$PATH
% PG_CONFIG=/usr/pgsql-9.3/bin/pg_config make

 

In the listing, the environment variable PG_CONFIG was explicitly set to the PostgreSQL® 9.3 installation. This refers to an installation of PGDG-RPM packages on CentOS 6, where pg_config is located outside the standard path. Furthermore, the postgresql93-devel package must be installed. If all paths have been set correctly, the FDW can be installed.

% PG_CONFIG=/usr/pgsql-9.3/bin/pg_config make install

 

This installs all necessary libraries. For these to function correctly and for the required Informix libraries to be found when loading into the PostgreSQL® instance, the latter must still be configured in the system’s dynamic linker. Based on the previously mentioned CentOS 6 platform, this is most easily done via an additional configuration file in /etc/ld.so.conf.d (this requires root privileges in this case!):

% vim /etc/ld.so.conf.d/informix.conf

 

This file should contain the paths to the required Informix libraries:

/opt/IBM/informix/lib
/opt/IBM/informix/lib/esql

 

Subsequently, the dynamic linker’s cache must be refreshed:

% ldconfig

 

The Informix FDW installation should now be ready for use.

Configuration

To use the Informix FDW in a database, the FDW must be loaded into the respective database. The FDW is a so-called EXTENSION, and these are loaded with the CREATE EXTENSION command:

#= CREATE EXTENSION informix_fdw;
CREATE EXTENSION

 

To connect an Informix database via the Informix FDW, one first needs a definition of how the external data source should be accessed. For this, a definition is created with the CREATE SERVER command, containing the information required by the Informix FDW. It should be noted that the options passed to a SERVER directive depend on the respective FDW. For the Informix FDW, at least the following parameters are required:

  • informixdir – CSDK installation directory
  • informixserver – Name of the server connection configured via the Informix CSDK (see Informix documentation, or $INFORMIXDIR/etc/sqlhosts for details)

The SERVER definition is then created as follows:

=# CREATE SERVER centosifx_tcp
  FOREIGN DATA WRAPPER informix_fdw
  OPTIONS(informixdir '/opt/IBM/informix',
  informixserver 'ol_informix1210');

 

The informixserver variable is the instance name of the Informix instance; here, one simply needs to follow the specifications of the Informix installation. The next step now creates a so-called Usermapping, with which a PostgreSQL® role (or user) is bound to the Informix access data. If this role is logged in to the PostgreSQL® server, it automatically uses the specified login information via the Usermapping to log in to the Informix server centosifx_tcp.

=# CREATE USER MAPPING FOR bernd
  SERVER centosifx_tcp
  OPTIONS(username 'informix',
  password 'informix')

 

Here too, the parameters specified in the OPTIONS directive depend on the respective FDW. Now, so-called Foreign Tables can be created to integrate Informix tables. In the following example, a simple relation with street names from the Informix server ol_informix1210 is integrated into the PostgreSQL® instance. It is important that the conversions of data types from Informix to PostgreSQL® occur with compatible data types. First, the definition of the Informix table as it was created in the Informix instance:

CREATE TABLE osm_roads(id bigint primary key,
  name varchar(255),
  highway varchar(32),
  area varchar(10));

 

The definition in PostgreSQL® should be analogous; if data types cannot be converted, an error will be thrown when creating the Foreign Table. When converting string types such as varchar, it is also important that a corresponding target conversion is defined when creating the Foreign Table. The Informix FDW therefore requires the following parameters when creating a Foreign Table:

  • client_locale – Client locale (should be identical to the server encoding of the PostgreSQL® instance)
  • db_locale – Database server locale
  • table or query – The table or SQL query on which the Foreign Table is based. A Foreign Table based on a query cannot execute modifying SQL operations.
#= CREATE FOREIGN TABLE osm_roads(id bigint,
  name varchar(255),
  highway varchar(32),
  area varchar(10))
  SERVER centosifx_tcp
  OPTIONS(table 'osm_roads',
  database 'kettle',
  client_locale 'en_US.utf8',
  db_locale 'en_US.819');

If you are unsure what the default locale of the PostgreSQL ® database looks like, you can most easily display it via SQL:

=# SELECT name, setting FROM pg_settings WHERE name IN ('lc_ctype', 'lc_collate');
  name | setting  
------------+------------
  lc_collate | en_US.utf8
  lc_ctype | en_US.utf8
(2 rows)

 

The locale and encoding settings should definitely be chosen to be as consistent as possible. If all settings have been chosen correctly, the data of the osm_roads table can be accessed directly via PostgreSQL®:

=# SELECT id, name FROM osm_roads WHERE name = 'Albert-Einstein-Straße';
  id | name  
------+------------------------
  1002 | Albert-Einstein-Straße
(1 row)

 

Depending on the number of tuples in the Foreign Table, selection can take a significant amount of time, as with incompatible WHERE clauses, the entire data set must first be transmitted to the PostgreSQL® server via a full table scan. However, the Informix FDW supports Predicate Pushdown under certain conditions. This refers to the ability to transmit parts of the WHERE condition that concern the Foreign Table to the remote server and apply the filter condition there. This saves the transmission of inherently useless tuples, as they would be filtered out on the PostgreSQL® server anyway. The above example looks like this in the execution plan:

#= EXPLAIN (ANALYZE, VERBOSE)
  SELECT id, name  
  FROM osm_roads WHERE name = 'Albert-Einstein-Straße';
  QUERY PLAN  
-----------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.osm_roads (cost=2925.00..3481.13 rows=55613 width=24) (actual time=85.726..4324.341 rows=1 loops=1)
  Output: id, name
  Filter: ((osm_roads.name)::text = 'Albert-Einstein-Straße'::text)
  Rows Removed by Filter: 55612
  Informix costs: 2925.00
  Informix query: SELECT *, rowid FROM osm_roads
  Total runtime: 4325.351 ms

 

The Filter display in this execution plan shows that a total of 55612 tuples were filtered out, and ultimately only a single tuple was returned because it met the filter condition. The problem here lies in the implicit cast that PostgreSQL® applies in the WHERE condition to the string column name. Current versions of the Informix FDW do not yet account for this. However, predicates can be transmitted to the Foreign Server if they correspond to integer types, for example:

#= EXPLAIN (ANALYZE, VERBOSE) SELECT id, name FROM osm_roads WHERE id = 1002;
  QUERY PLAN  
--------------------------------------------------------------------------------------------------------------------------
  Foreign Scan on public.osm_roads (cost=2925.00..2980.61 rows=5561 width=40) (actual time=15.849..16.410 rows=1 loops=1)
  Output: id, name
  Filter: (osm_roads.id = 1002)
  Informix costs: 2925.00
  Informix query: SELECT *, rowid FROM osm_roads WHERE (id = 1002)
  Total runtime: 17.145 ms
(6 rows)

 

Data Modification

With PostgreSQL® 9.3, the Informix FDW also supports manipulating (UPDATE), deleting (DELETE), and inserting (INSERT) data into the Foreign Table. Furthermore, transactions initiated in the PostgreSQL® client implicitly interact with transactions on the Foreign Server (provided that the Informix database was created with logging and is therefore transactional). The following example demonstrates this capability:

 

=# BEGIN;
BEGIN
 
*=# INSERT INTO osm_roads(id, name, highway) VALUES(55614, 'Hans-Mustermann-Straße', 'no');
INSERT 0 1
 
*=# SAVEPOINT A;
SAVEPOINT
 
*=# UPDATE osm_roads SET area = 'Nordrhein-Westfalen' WHERE id = 55614;
ERROR: value too long for type character varying(10)
 
!=# ROLLBACK TO A;
ROLLBACK
 
*=# UPDATE osm_roads SET area = 'NRW' WHERE id = 55614;
UPDATE 1
 
*=# COMMIT;
COMMIT
 
=# SELECT * FROM osm_roads WHERE id = 55614;
  id | name | highway | area  
-------+------------------------+---------+------
  55614 | Hans-Mustermann-Straße | no | NRW
(1 row)

 

The example first creates a new record after starting a transaction. Subsequently, a SAVEPOINT is set to secure the current state of this transaction. In the next step, the new record is modified because the locality of the new record was forgotten to be specified. However, since only country codes are allowed, this fails due to the excessive length of the new identifier, and the transaction becomes invalid. Via the SAVEPOINT, the transaction is rolled back to the last created savepoint, and the Informix FDW also implicitly rolls back to this SAVEPOINT on the Foreign Server. Subsequently, the correct country code can be entered within this transaction. After the COMMIT, the Informix FDW also confirms the transaction on the Foreign Server, and the record has been correctly entered.

Summary

Foreign Data Wrappers are a very powerful and flexible tool for integrating PostgreSQL® into heterogeneous database landscapes. These interfaces are by no means limited to purely relational data sources. Furthermore, with the support of modifying SQL queries (DML), the FDW API also enables the integration of writable data sources into PostgreSQL® databases.

This article was originally written by Bernd Helmle.

Categories: PostgreSQL®
Tags: FDW Informix PostgreSQL®

cR

About the author

credativ Redaktion

about the person

Dieser Account dient als Sammelpunkt für die wertvollen Beiträge ehemaliger Mitarbeiter von credativ. Wir bedanken uns für ihre großartigen Inhalte, die das technische Wissen in unserem Blog über die Jahre hinweg bereichert haben. Ihre Artikel bleiben hier weiterhin für unsere Leser zugänglich.

View posts


Beitrag teilen: