| Categories: | PostgreSQL® |
|---|---|
| Tags: | credativ News Open Source PostgreSQL® |
The PostgreSQL® Global Development Group (PGDG) has released version 12 of the popular, free database PostgreSQL®. As our Beta 4 article already suggested, a multitude of new features, improvements, and optimizations have been incorporated into the release. These include:
btree indexes, the standard index type in PostgreSQL®, have received several optimizations in PostgreSQL® 12.
Historically, btree indexes stored duplicates (i.e., multiple entries with the same key values) in an unsorted order. This resulted in suboptimal utilization of the physical representation in the affected indexes. An optimization now stores these multiple key values in the same order as they are physically stored in the table. This improves storage utilization and reduces the overhead for managing corresponding btree indexes. Furthermore, indexes with multiple indexed columns use an improved physical representation, which also enhances their storage utilization. To benefit from this in PostgreSQL® 12, these indexes must be rebuilt or reindexed if the upgrade to the new version was performed via binary upgrade using pg_upgrade.
Insert operations into btree indexes are also accelerated by improved locking.
credativ has contributed an extension for pg_checksums that allows activating or deactivating block checksums in stopped PostgreSQL® instances. Previously, this could only be done by recreating the physical data representation of the cluster via initdb.
pg_checksums now also features the ability to display a status history on the console with the parameter --progress. The corresponding code contributions come from colleagues Michael Banck and Bernd Helmle.
Up to and including PostgreSQL® 11, the PostgreSQL® Optimizer was unable to optimize Common Table Expressions (also known as CTEs or WITH queries). If such an expression was used in a query, the CTE was always evaluated and materialized first before the rest of the query was processed. For more complex CTE expressions, this led to correspondingly expensive execution plans. The following generic example clearly illustrates this. Given a join with a CTE expression that filters all even numbers from a numeric column:
WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id);
In PostgreSQL® 11, the use of a CTE always leads to a CTE Scan, which materializes the CTE expression first:
EXPLAIN (ANALYZE, BUFFERS) WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id) ; QUERY PLAN ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Aggregate (cost=2231.12..2231.14 rows=1 width=8) (actual time=48.684..48.684 rows=1 loops=1) Buffers: shared hit=488 CTE t_cte -> Seq Scan on foo (cost=0.00..1943.00 rows=500 width=4) (actual time=0.055..17.146 rows=50000 loops=1) Filter: ((id % 2) = 0) Rows Removed by Filter: 50000 Buffers: shared hit=443 -> Hash Join (cost=270.00..286.88 rows=500 width=0) (actual time=7.297..47.966 rows=5000 loops=1) Hash Cond: (t_cte.id = bar.id) Buffers: shared hit=488 -> CTE Scan on t_cte (cost=0.00..10.00 rows=500 width=4) (actual time=0.063..31.158 rows=50000 loops=1) Buffers: shared hit=443 -> Hash (cost=145.00..145.00 rows=10000 width=4) (actual time=7.191..7.192 rows=10000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 480kB Buffers: shared hit=45 -> Seq Scan on bar (cost=0.00..145.00 rows=10000 width=4) (actual time=0.029..3.031 rows=10000 loops=1) Buffers: shared hit=45 Planning Time: 0.832 ms Execution Time: 50.562 ms (19 rows)
This plan first materializes the CTE with a Sequential Scan with the corresponding filter (id % 2 = 0). No functional index is used here, so this scan is correspondingly more expensive. Afterwards, the result of the CTE is joined with the table bar via Hash Join with the corresponding Join condition. With PostgreSQL® 12, the optimizer now gains the ability to inline these CTE expressions without prior materialization. The underlying, optimized plan in PostgreSQL® 12 then looks like this:
EXPLAIN (ANALYZE, BUFFERS) WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id) ; QUERY PLAN ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Aggregate (cost=706.43..706.44 rows=1 width=8) (actual time=9.203..9.203 rows=1 loops=1) Buffers: shared hit=148 -> Merge Join (cost=0.71..706.30 rows=50 width=0) (actual time=0.099..8.771 rows=5000 loops=1) Merge Cond: (foo.id = bar.id) Buffers: shared hit=148 -> Index Only Scan using foo_id_idx on foo (cost=0.29..3550.29 rows=500 width=4) (actual time=0.053..3.490 rows=5001 loops=1) Filter: ((id % 2) = 0) Rows Removed by Filter: 5001 Heap Fetches: 10002 Buffers: shared hit=74 -> Index Only Scan using bar_id_idx on bar (cost=0.29..318.29 rows=10000 width=4) (actual time=0.038..3.186 rows=10000 loops=1) Heap Fetches: 10000 Buffers: shared hit=74 Planning Time: 0.646 ms Execution Time: 9.268 ms (15 rows)
The advantage of this method is that the initial materialization of the CTE expression is omitted. Instead, the query is executed directly with a Join.
This works for all non-recursive CTE expressions without side effects (e.g., CTEs with write statements) and those that are referenced only once per query. The old optimizer behavior can be forced with the WITH ... AS MATERIALIZED ... statement.
New in PostgreSQL ® 12 are so-called Generated Columns, which calculate a result based on existing column values using expressions. These are stored with the corresponding source values in the tuple. The advantage is that the creation of triggers for subsequent calculation of column values can be avoided. The following simple example, based on a price table with net and gross prices, illustrates the new functionality:
CREATE TABLE preise(netto numeric, brutto numeric GENERATED ALWAYS AS (netto * 1.19) STORED); INSERT INTO preise VALUES(17.30); INSERT INTO preise VALUES(225); INSERT INTO preise VALUES(247); INSERT INTO preise VALUES(19.15); SELECT * FROM preise; netto │ brutto ───────┼───────── 17.30 │ 20.5870 225 │ 267.75 247 │ 293.93 19.15 │ 22.7885 (4 rows)
The brutto column is calculated directly from the net price. The keyword STORED is mandatory. Of course, indexes can also be created on Generated Columns, but they cannot be part of a primary key. Furthermore, the SQL expression must be unambiguous, i.e., it must yield the same result even with the same input set. Columns declared as Generated Columns cannot be explicitly used in INSERT or UPDATE operations. If a column list is absolutely necessary, the corresponding value can be indirectly referenced with the keyword DEFAULT.
Explicit OID columns were historically a way to create unique column values so that a table row could be uniquely identified database-wide. For a long time, however, these have only been explicitly created in PostgreSQL®, and their fundamental functionality is considered obsolete. With PostgreSQL®, the ability to explicitly create such columns is now finally abolished. This means it will no longer be possible to specify the WITH OIDS directive for tables. System tables, which have always uniquely referenced objects by OID, will now return OID values without explicit specification of OID columns in the result set. Older software, in particular, that carelessly handled catalog queries, could encounter problems due to duplicate column output.
recovery.conf to the postgresql.confUp to and including PostgreSQL® 11, database recovery and streaming replication instances were configured via a separate configuration file recovery.conf.
With PostgreSQL® 12, all configuration work previously done there now moves to the postgresql.conf. The recovery.conf file is no longer used. PostgreSQL® 12 refuses to start if this file is present. Whether recovery or a streaming standby is desired is now decided either by a recovery.signal file (for recovery) or standby.signal (for standby systems). The latter takes precedence if both files are present. The old parameter standby_mode, which previously controlled this behavior, has been removed.
For automatic deployments of high-availability systems, this represents a major change. However, it is now also possible to perform almost all corresponding configuration work via ALTER SYSTEM, so that only one configuration file needs to be maintained.
With PostgreSQL® 12, there is now a way to rebuild indexes with as few locks as possible. This significantly simplifies one of the most common maintenance tasks in very write-intensive databases. Previously, a combination of CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY had to be used. Care also had to be taken to assign new index names accordingly.
The Release Notes provide an even more detailed overview of all new features and, especially, incompatibilities compared to previous PostgreSQL® versions.
This article was originally written by Bernd Helmle.
| Categories: | PostgreSQL® |
|---|---|
| Tags: | credativ News Open Source PostgreSQL® |
About the author
about the person
This account serves as a hub for the valuable contributions of former credativ employees. We would like to thank them for their great content, which has enriched the technical knowledge on our blog over the years. Their articles remain accessible to our readers here.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Brevo. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou need to load content from Turnstile to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Turnstile. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information