Postgres 14 Internals: Data Organisation

Posted on 5 June, 2023

A single PostgreSQL instance can serve several databases at a time; together they are called a database cluster.
The directory that contains all the files related to the cluster is usually called PGDATA.
After a cluster is intialised 3 databases are available.
1. template0: used for restoring data from a logical backup or creating a database with a different encoding; it must never be modified.
2. template1: serves as a template for all the other databases that a user can create in the cluster.
3. postgres: a regular database that you can use at your discretion.
You can find the list of dbs using select * from pg_database.

System Catalog

Metadata of all cluster objects (such as tables, indexes, data types, or functions) is stored in tables that belong to the system catalog.
All system catalog tables begin with pg_, like in pg_database.

Schemas

Schemas are namespaces that store all objects of a database.
Some predefined schemas include
- public: default schema for user objects.
- pg_catalog: used for system catalog tables.
- information_schema: provides an alternative view for the system catalog as defined by the SQL standard.
- pg_toast: for objects related to TOAST.
- pg_temp: comprises temporary tables.
If the schema is not specified explicitly when an object is accessed, Postgres selects the first suitable schema from the search path.

Tablespaces

Tablespaces define physical data layout.
A tablespace is a directory in a file system.
One and the same tablespace can be used by different databases, and each database can store data in several tablespaces.
Each database has the so-called default tablespace.
You can find all tablespaces using the following query.
```
select * from pg_tablespace;
```
Sample output:
oid spcname spcowner spcacl spcoptions
1663
pg_default
10
1664
pg_global
10

oid	spcname	spcowner	spcacl	spcoptions
1663	pg_default	10
1664	pg_global	10

Files, Forks & Pages

All information associated with a relation is stored in several different files called forks, each containing data of a particular type.
Its filename consists of a numeric ID (oid), which can be extended by a suffix that corresponds to the fork’s type.
This file can grow over time, and when its size reaches 1 GB, another file of this fork is created (such files are sometimes called segments).
A single relation (table) can be represented on disk by several files.
All files are logically split into pages (or blocks) which is the minimum amount of data that can be read or written (8KB)

Types of Forks

TODO

TOAST

The Oversized Attributes Storage Technique (TOAST) is used to store large values when their size overheads the maximum page size.
If the main table contains potentially long attributes, a separate TOAST table is created for it right away, one for all the attributes. For example, if a table has a column of the numeric or text type, a TOAST table will be created even if this column will never store any long values.
```
SELECT
  attname,
  atttypid::regtype,
  attstorage,
  CASE attstorage
    WHEN 'p' THEN 'plain'
    WHEN 'e' THEN 'external'
    WHEN 'm' THEN 'main'
    WHEN 'x' THEN 'extended'
  END AS storage
FROM
  pg_attribute
WHERE
  attrelid = 'tablename'::regclass
  AND attnum > 0;
```
Here's a glimpse of the output of the above query.
attname atttypid attstorage storage
a
uuid
p
plain
b
text
x
extended
c
jsonb
x
extended
d
timestamp with time zone
p
plain
e
text[]
x
extended
- plain means that TOAST is not used (applied to “short” data-types such as the integer type).
- extended allows both compressing attributes and storing them in a separate TOAST table.
- external implies that long attributes are stored in the TOAST table in an uncompressed state.
- main requires long attributes to be compressed first; they will be moved to the TOAST table only if compression version did not help.
TOAST tables reside in a separate schema called pg_toast, but they are usually hidden. To find the TOAST table for a particular table, you can use the following query.
```
SELECT
  reltoastrelid::regclass
FROM
  pg_class
WHERE
  oid = 'tablename'::regclass;
```
Sample output:
reltoastrelid
pg_toast.pg_toast_16415

attname	atttypid	attstorage	storage
a	uuid	p	plain
b	text	x	extended
c	jsonb	x	extended
d	timestamp with time zone	p	plain
e	text[]	x	extended

reltoastrelid
pg_toast.pg_toast_16415

Processes & Memory

First process to start is the postmaster process, which is responsible for managing the cluster. If any process dies, the postmaster will restart it.
To enable process interaction, postmaster allocates shared memory, which is avail- able to all the processes.
postmaster process also listens for incoming connections. Once a new client appears, postmaster spawns a separate backend process. This involves authentication, then taking in SQL queries as text, parsing them, and executing them.

PreviousList all values of a custom enum in Postgres NextSQLite db optimization with ROWID

Last updated 1 year ago