What are columns generated in PostgreSQL 12?
PostgreSQL is known for its almost fanatical pursuit of database standards implementation. With the release of version 12, another step was taken towards making them available as fully as possible in the Open Source model. In this text we will focus on generated columns, also called virtual columns, which allow you to use the latest […]
PostgreSQL is known for its almost fanatical pursuit of database standards implementation. With the release of version 12, another step was taken towards making them available as fully as possible in the Open Source model. In this text we will focus on generated columns, also called virtual columns, which allow you to use the latest edition of one of the most popular database engine developers – PostgreSQL 12.
At the beginning it is worth explaining what columns are generated ( generated columns). paraphrasing SQL standard: 2003, the column generated in the table is one whose values will be the result of an expression whose components are other columns. It sounds complicated, so it’s worth showing it in practice, using an example using a pseudo-language (we’ll go to SQL later).
ID | net_price | tax_rate | price |
---|---|---|---|
01 | 10.00 | 23% | 12.30 |
02 | 123.00 | 23% | 151.29 |
03 | 39.99 | 8% | 43.19 |
As you can see, for such a field value table price
they simply result from the value of other fields. In this case, the expression in the pseudo-language could look like this:
price: = net_price + net_price * tax_rate
Conceptually you can see that the column generated in relation to other columns is the same as the view for tables.
By storing only information on how to generate one of the columns in the database, instead of the results of this action, you can save quite a large amount of space (for suitably large data sets). Of course, like most compromises in computer science, this entails an increased demand for computing resources. This way you can see probably why some manufacturers decided to change the standard name and use the concept of a virtual column, because its data is not necessarily on disk.
Columns generated in PostgreSQL 12
Let’s get to the point and see how the latest Postgres allows you to use this function. According to documentation a clause is responsible for creating this type of columns GENERATED ALWAYS AS
which should be used when creating the table. Take the previous example to the workshop:
CREATE TABLE product_prices ( ... net_price NUMERIC, tax_rate NUMERIC, price NUMERIC GENERATED ALWAYS AS (netto_price + net_price * tax_rate) STORED, );
It is worth mentioning the key word here STORED
. This means that the generated value will be stored on disk. At the moment, this is also the only available form of creating generated columns, which does not mean that the next versions of Postgres will not allow operations on quite „virtual ” columns. Already, however, it is a very convenient way to directly point to the links between data already at the stage of database design. This allows you to order the „database to ensure ” update in those places where we care. For now, however, we can wait for full implementation, which will also allow you to take advantage of VIRTUAL
. However, we can already use this function, e.g. when changing data:
postgres=# SELECT * FROM product_prices;
id | netto_price | tax_rate | price
----+-------------+----------+----------
1 | 10.00 | 0.23 | 12.3000
2 | 123.00 | 0.23 | 151.2900
3 | 39.99 | 0.08 | 43.1892
(3 rows)
postgres=# UPDATE product_prices SET netto_price = 35.99 WHERE id = 3;
UPDATE 1
postgres=# SELECT * FROM product_prices;
id | netto_price | tax_rate | price
----+-------------+----------+----------
1 | 10.00 | 0.23 | 12.3000
2 | 123.00 | 0.23 | 151.2900
3 | 35.99 | 0.08 | 38.8692
(3 rows)
Generated and identity columns
It is worth mentioning here the so-called identity column, i.e. a mechanism that allows you to define columns whose values are part of a certain sequence. This is most often used in columns that are the record identification number. In older versions of PostgreSQL, the recommended method for defining columns ID
was to use a special type SERIAL
. Best practice since release 10 for new projects is the use of identity columns. I mention them here because of their syntax, which is very similar. Using our example, the full table definition should look like this:
CREATE TABLE product_prices ( id INT GENERAL ALWAYS AS IDENTITY, net_price NUMERIC, tax_rate NUMERIC, price NUMERIC GENERATED ALWAYS AS (netto_price + net_price * tax_rate) STORED, );
Examples of using generated columns
Because the columns are generated by a special being in the world of Postgres, they can be used in a rather interesting way. One such application is to control the visibility of individual columns, because the generated columns receive separate permissions from the columns on which they are based:
postgres => REVOKE ALL ON product_prices FROM eurolinux; REVOKE postgres => GRANT SELECT (id, net_price, tax_rate) ON product_prices TO eurolinux; GRANT postgres => SELECT * FROM product_prices; ERROR: permission denied for table product_prices
As you can see, access to base columns does not allow you to read the generated column as well. The situation is the opposite, access to the generated column does not allow you to see the values from which it results.
The possibility of using it in everyday practice can be a very good way to simplify some queries or to give up triggers in some cases. This will definitely simplify management and develop the databases used.
Summary
As you can see, a relatively simple topic can lead to quite long considerations. Thank you for the time spent reading this text and I encourage you to subscribe to our newsletter.