Question: What is the difference between a PostgreSQL cluster and an instance?


In general database discussions, the terms "cluster" and "instance" are often used interchangeably, but in PostgreSQL they have distinct meanings: the cluster is the set of databases stored on disk, and the instance is the running server that serves it.

PostgreSQL Instance

A PostgreSQL instance is one running PostgreSQL server on a host machine: the postmaster (the main postgres process) together with the background and per-connection backend processes it starts, plus the shared memory they use. An instance is started against a specific data directory (PGDATA), which holds the database files and, by default, the configuration files. It listens on its own port, accepts client connections, and executes queries and transactions. A single instance can serve multiple databases.
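One way to see the link between a running instance and its data directory is the postmaster.pid file that the server writes into PGDATA at startup. The sketch below reads the first four lines of that file, assuming the layout used by current PostgreSQL releases (PID, data directory, start time, port). The function name read_postmaster_pid is hypothetical, and a stale file can remain after a crash, so treat the result as a hint rather than proof that the instance is alive.

```python
from pathlib import Path
from typing import Optional


def read_postmaster_pid(data_dir: str) -> Optional[dict]:
    """Return basic facts about the instance using data_dir, or None.

    None means postmaster.pid is missing or too short to parse, which
    usually means no instance is currently running on this directory.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    if not pid_file.is_file():
        return None

    lines = pid_file.read_text().splitlines()
    if len(lines) < 4:
        return None

    # Assumed layout: line 1 = postmaster PID, line 2 = data directory,
    # line 3 = start time (Unix epoch), line 4 = TCP port.
    return {
        "pid": int(lines[0].strip()),
        "data_dir": lines[1].strip(),
        "start_time": int(lines[2].strip()),
        "port": int(lines[3].strip()),
    }
```

Calling read_postmaster_pid on the PGDATA path of a running server should return the port that instance listens on; on a stopped cluster it should return None.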

PostgreSQL Cluster

The term cluster, in PostgreSQL, does not refer to multiple servers working together (as it might in some other database systems). Instead, a PostgreSQL database cluster is the collection of databases stored in one data directory and managed by a single running instance. A cluster is created with initdb. The databases in a cluster share the instance's configuration and cluster-wide objects such as roles and tablespaces, but each database maintains its own set of tables, views, and other data objects. Extensions are installed per database with CREATE EXTENSION, so they are not shared across the cluster.
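Because a cluster is a directory on disk, it can be recognized without a running server. The following sketch checks for three entries that initdb creates: the PG_VERSION file, the base directory (per-database files), and the global directory (cluster-wide catalogs such as the list of databases and roles). The function name cluster_version is hypothetical, and the check is a heuristic, not a validation of the cluster's integrity.

```python
from pathlib import Path
from typing import Optional


def cluster_version(data_dir: str) -> Optional[str]:
    """Return the PG_VERSION string if data_dir looks like a cluster, else None."""
    root = Path(data_dir)
    version_file = root / "PG_VERSION"

    # PG_VERSION records the major version that created the cluster.
    if not version_file.is_file():
        return None

    # base/ holds one subdirectory per database; global/ holds
    # cluster-wide tables shared by all databases.
    if not (root / "base").is_dir() or not (root / "global").is_dir():
        return None

    return version_file.read_text().strip()
```

The function works the same whether or not an instance is running, which reflects the distinction: the cluster persists on disk, while the instance exists only while the server processes are up.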

Key Differences

  • Granularity: A cluster is the data on disk: one data directory holding a set of databases. An instance is the running server (processes and shared memory) that serves exactly one cluster. A cluster still exists when its instance is stopped.
  • Scalability: PostgreSQL does not use clusters for horizontal scaling across multiple machines (you would need external tools like Citus or Postgres-XL for this). You can, however, run several clusters on a single server, each with its own data directory, port, and instance, to isolate workloads and resources.
  • Configuration: Server settings and client authentication rules (postgresql.conf, pg_hba.conf) belong to one cluster and apply to every database in it. Roles are also defined cluster-wide. Database creation is a cluster-level operation, while privileges on tables and other objects are granted within each database.
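To make the multi-cluster case concrete, the sketch below builds (but does not run) the commands for creating and starting a separate cluster on its own port, using the standard initdb and pg_ctl tools. The helper name cluster_commands, the example paths, and the port numbers are hypothetical; the commands assume the PostgreSQL binaries are on PATH and that the caller has permission to create the directories.

```python
import shlex
from typing import List


def cluster_commands(data_dir: str, port: int) -> List[List[str]]:
    """Return the commands to create a cluster and start an instance on it."""
    return [
        # initdb creates a new cluster in the given data directory.
        ["initdb", "-D", data_dir],
        # pg_ctl starts one instance for that cluster; -o passes
        # options to the server, here the port it should listen on.
        ["pg_ctl", "-D", data_dir, "-o", f"-p {port}",
         "-l", f"{data_dir}/server.log", "start"],
    ]


# Two independent clusters on one host (hypothetical paths and ports).
for path, port in [("/srv/pg/app_a", 5433), ("/srv/pg/app_b", 5434)]:
    for cmd in cluster_commands(path, port):
        print(shlex.join(cmd))
```

Each cluster gets its own data directory, configuration files, roles, and port, so the two instances do not share databases or settings even though they run on the same machine.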

Understanding these distinctions is crucial when planning database architectures, performing backups, setting up replication, or configuring multi-tenant environments in PostgreSQL.
