Postgres 14 Internals: Data Organisation
Posted on 5 June, 2023
- A single PostgreSQL instance can serve several databases at a time; together they are called a database cluster.
- The directory that contains all the files related to the cluster is usually called
- After a cluster is intialised 3 databases are available.
You can find the list of dbs using
template0: used for restoring data from a logical backup or creating a database with a different encoding; it must never be modified.
template1: serves as a template for all the other databases that a user can create in the cluster.
postgres: a regular database that you can use at your discretion.
select * from pg_database.
- Metadata of all cluster objects (such as tables, indexes, data types, or functions) is stored in tables that belong to the system catalog.
- All system catalog tables begin with
pg_, like in
- Schemas are namespaces that store all objects of a database.
- Some predefined schemas include
public: default schema for user objects.
pg_catalog: used for system catalog tables.
information_schema: provides an alternative view for the system catalog as defined by the SQL standard.
pg_toast: for objects related to
pg_temp: comprises temporary tables.
- If the schema is not specified explicitly when an object is accessed, Postgres selects the first suitable schema from the search path.
- Tablespaces define physical data layout.
- A tablespace is a directory in a file system.
- One and the same tablespace can be used by different databases, and each database can store data in several tablespaces.
- Each database has the so-called default tablespace.
- You can find all tablespaces using the following query.select * from pg_tablespace;Sample output:
- All information associated with a relation is stored in several different files called forks, each containing data of a particular type.
- Its filename consists of a numeric ID (oid), which can be extended by a suffix that corresponds to the fork’s type.
- This file can grow over time, and when its size reaches 1 GB, another file of this fork is created (such files are sometimes called segments).
- A single relation (table) can be represented on disk by several files.
- All files are logically split into pages (or blocks) which is the minimum amount of data that can be read or written (8KB)
- The Oversized Attributes Storage Technique (TOAST) is used to store large values when their size overheads the maximum page size.
- If the main table contains potentially long attributes, a separate TOAST table is created for it right away, one for all the attributes. For example, if a table has a column of the numeric or
texttype, a TOAST table will be created even if this column will never store any long values.SELECTattname,atttypid::regtype,attstorage,CASE attstorageWHEN 'p' THEN 'plain'WHEN 'e' THEN 'external'WHEN 'm' THEN 'main'WHEN 'x' THEN 'extended'END AS storageFROMpg_attributeWHEREattrelid = 'tablename'::regclassAND attnum > 0;Here's a glimpse of the output of the above query.
- plain means that TOAST is not used (applied to “short” data-types such as the integer type).
- extended allows both compressing attributes and storing them in a separate TOAST table.
- external implies that long attributes are stored in the TOAST table in an uncompressed state.
- main requires long attributes to be compressed first; they will be moved to the TOAST table only if compression version did not help.
- TOAST tables reside in a separate schema called
pg_toast, but they are usually hidden. To find the TOAST table for a particular table, you can use the following query.SELECTreltoastrelid::regclassFROMpg_classWHEREoid = 'tablename'::regclass;Sample output:
- First process to start is the postmaster process, which is responsible for managing the cluster. If any process dies, the postmaster will restart it.
- To enable process interaction, postmaster allocates shared memory, which is avail- able to all the processes.
- postmaster process also listens for incoming connections. Once a new client appears, postmaster spawns a separate backend process. This involves authentication, then taking in SQL queries as text, parsing them, and executing them.