Question: What is the difference between a PostgreSQL cluster and a database?

Answer

In PostgreSQL, the terms "cluster" and "database" have specific meanings that are important to understand for effective database management.

PostgreSQL Cluster

A PostgreSQL cluster is a collection of databases managed by a single instance of the PostgreSQL server. Besides the databases themselves, a cluster holds shared resources such as roles (users), tablespaces, and configuration files. These shared resources are available across all databases within the same cluster. Extensions are not among them: an extension's files are installed once per server, but the extension has to be enabled separately in each database with CREATE EXTENSION.

The term "cluster" in PostgreSQL doesn't imply multiple servers or redundancy, but rather a group of databases managed by a single PostgreSQL server process. The physical data of a PostgreSQL cluster is stored in what's called the "data directory," which contains subdirectories for each database as well as transaction logs, configuration files, and other crucial metadata.
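As a rough illustration of the layout described above, the following queries can be run from any database in the cluster. This is a sketch, not an exhaustive inspection: SHOW data_directory normally requires superuser rights or membership in pg_read_all_settings, and the base/<oid> mapping applies to databases stored in the default tablespace.

```sql
-- Location of the cluster's data directory on disk.
SHOW data_directory;

-- One row per database in the cluster. For databases in the default
-- tablespace, the oid matches a subdirectory name under base/ in the
-- data directory.
SELECT oid, datname
FROM pg_database
ORDER BY datname;
```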

When you start a PostgreSQL server, you're essentially starting up a cluster. All databases within this cluster share the same background processes, such as the background writer, checkpointer, WAL writer, and autovacuum launcher.
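One way to see these shared processes is to query pg_stat_activity. This is a minimal sketch that assumes PostgreSQL 10 or later (where the backend_type column exists) and a role that is a superuser or a member of pg_read_all_stats; without those privileges some rows may show NULL values.

```sql
-- List server processes that are not ordinary client connections.
-- The same set of processes serves every database in the cluster.
SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type <> 'client backend'
ORDER BY backend_type;
```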

PostgreSQL Database

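A PostgreSQL database, on the other hand, is a named collection of schemas and the objects in them (tables, views, functions, and so on) held in a cluster. Each database is isolated from the others in the same cluster: a client connection is bound to exactly one database, and a query cannot reference tables in another database without a mechanism such as postgres_fdw or dblink. Access is also controlled per database, through the CONNECT privilege and pg_hba.conf. Note that a newly created database grants CONNECT to PUBLIC by default, so restricting a role to particular databases requires revoking that default.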

Databases within a cluster can be created, removed, and managed independently. However, some objects, known as global objects, are shared across all databases within the same cluster; roles and tablespaces are the main examples.
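The cluster-wide scope of roles can be checked with a short experiment. In this sketch the role name app_reader is hypothetical, chosen only for illustration; the catalogs pg_roles and pg_tablespace are shared, so the same rows should appear no matter which database the session is connected to.

```sql
-- Hypothetical role name, used only for illustration.
-- Creating it requires the CREATEROLE attribute or superuser rights.
CREATE ROLE app_reader NOLOGIN;

-- Shared catalogs: run these while connected to different databases
-- in the same cluster and compare the results.
SELECT rolname FROM pg_roles WHERE rolname = 'app_reader';
SELECT spcname FROM pg_tablespace;
```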

Example Usage

Suppose you run a PostgreSQL installation for a company that manages several applications. You might set up a separate database for each application within a single cluster. This way, each application operates independently in terms of data and schema but uses the same PostgreSQL instance:

-- Creating multiple databases in a single cluster
CREATE DATABASE sales_app;
CREATE DATABASE inventory_app;
CREATE DATABASE hr_app;

Each of these databases (sales_app, inventory_app, and hr_app) would be part of the same PostgreSQL cluster but isolated from the others in terms of the data it holds.
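If each application should also be limited to its own database at the permission level, one possible approach is sketched below. The role name sales_user is an assumption made for this example, and the statements assume they are run by a superuser or by the owner of the databases. Superusers and database owners keep their access regardless of these grants.

```sql
-- Hypothetical application role; the name is illustrative only.
CREATE ROLE sales_user LOGIN;

-- CONNECT is granted to PUBLIC on new databases by default,
-- so remove that default before granting access selectively.
REVOKE CONNECT ON DATABASE sales_app, inventory_app, hr_app FROM PUBLIC;

-- Allow the role to connect only to its own application's database.
GRANT CONNECT ON DATABASE sales_app TO sales_user;

-- Check the effective privilege for each database.
SELECT datname,
       has_database_privilege('sales_user', datname, 'CONNECT') AS can_connect
FROM pg_database
WHERE datname IN ('sales_app', 'inventory_app', 'hr_app');
```

Under these assumptions, can_connect is expected to be true only for sales_app. Whether a connection actually succeeds also depends on the rules in pg_hba.conf.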

Conclusion

Understanding the distinction between a PostgreSQL cluster and databases helps in effectively planning the architecture of your applications using PostgreSQL. It provides clarity on how data management, security, and resource sharing are handled within PostgreSQL environments.
