Postgres 14 Internals: Data Organisation
Posted on 5 June, 2023
A single PostgreSQL instance can serve several databases at a time; together they are called a database cluster.
The directory that contains all the files related to the cluster is usually called
PGDATA
.After a cluster is intialised 3 databases are available.
template0
: used for restoring data from a logical backup or creating a database with a different encoding; it must never be modified.template1
: serves as a template for all the other databases that a user can create in the cluster.postgres
: a regular database that you can use at your discretion.
You can find the list of dbs using
select * from pg_database
.
System Catalog
Metadata of all cluster objects (such as tables, indexes, data types, or functions) is stored in tables that belong to the system catalog.
All system catalog tables begin with
pg_
, like inpg_database
.
Schemas
Schemas are namespaces that store all objects of a database.
Some predefined schemas include
public
: default schema for user objects.pg_catalog
: used for system catalog tables.information_schema
: provides an alternative view for the system catalog as defined by the SQL standard.pg_toast
: for objects related toTOAST
.pg_temp
: comprises temporary tables.
If the schema is not specified explicitly when an object is accessed, Postgres selects the first suitable schema from the search path.
Tablespaces
Tablespaces define physical data layout.
A tablespace is a directory in a file system.
One and the same tablespace can be used by different databases, and each database can store data in several tablespaces.
Each database has the so-called default tablespace.
You can find all tablespaces using the following query.
Sample output:
oid spcname spcowner spcacl spcoptions 1663
pg_default
10
1664
pg_global
10
Files, Forks & Pages
All information associated with a relation is stored in several different files called forks, each containing data of a particular type.
Its filename consists of a numeric ID (oid), which can be extended by a suffix that corresponds to the fork’s type.
This file can grow over time, and when its size reaches 1 GB, another file of this fork is created (such files are sometimes called segments).
A single relation (table) can be represented on disk by several files.
All files are logically split into pages (or blocks) which is the minimum amount of data that can be read or written (8KB)
Types of Forks
TODO
TOAST
The Oversized Attributes Storage Technique (TOAST) is used to store large values when their size overheads the maximum page size.
If the main table contains potentially long attributes, a separate TOAST table is created for it right away, one for all the attributes. For example, if a table has a column of the numeric or
text
type, a TOAST table will be created even if this column will never store any long values.Here's a glimpse of the output of the above query.
attname atttypid attstorage storage a
uuid
p
plain
b
text
x
extended
c
jsonb
x
extended
d
timestamp with time zone
p
plain
e
text[]
x
extended
plain means that TOAST is not used (applied to “short” data-types such as the integer type).
extended allows both compressing attributes and storing them in a separate TOAST table.
external implies that long attributes are stored in the TOAST table in an uncompressed state.
main requires long attributes to be compressed first; they will be moved to the TOAST table only if compression version did not help.
TOAST tables reside in a separate schema called
pg_toast
, but they are usually hidden. To find the TOAST table for a particular table, you can use the following query.Sample output:
reltoastrelid pg_toast.pg_toast_16415
Processes & Memory
First process to start is the postmaster process, which is responsible for managing the cluster. If any process dies, the postmaster will restart it.
To enable process interaction, postmaster allocates shared memory, which is avail- able to all the processes.
postmaster process also listens for incoming connections. Once a new client appears, postmaster spawns a separate backend process. This involves authentication, then taking in SQL queries as text, parsing them, and executing them.
Last updated