Question: How do you partition a table by multiple columns in PostgreSQL?

Answer

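Table partitioning in PostgreSQL divides a large table into smaller, more manageable pieces, called partitions, based on the values of one or more columns. This can significantly improve performance for queries that filter on the partition key and for data maintenance operations such as dropping old data. PostgreSQL's declarative partitioning offers range, list, and hash strategies. Range and hash partitioning accept a multi-column partition key, while list partitioning accepts only a single column or expression. A common use case is a sales table split by region and by year, either with one multi-column range key or by list-partitioning on region and then sub-partitioning each region by range on the date.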
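Because list partitioning takes a single key, the "region as list, year as range" layout is usually built with sub-partitioning. The following is a minimal sketch of that approach, not a recommended schema; the table names (sales_by_region and its children) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table names, used only for this sketch.
CREATE TABLE sales_by_region (
    sale_id   serial  NOT NULL,
    region    text    NOT NULL,
    sale_date date    NOT NULL,
    amount    numeric NOT NULL
) PARTITION BY LIST (region);

-- The 'north' partition is itself partitioned by range on sale_date.
CREATE TABLE sales_by_region_north PARTITION OF sales_by_region
    FOR VALUES IN ('north')
    PARTITION BY RANGE (sale_date);

-- Leaf partition holding 'north' rows dated within 2020.
CREATE TABLE sales_by_region_north_2020 PARTITION OF sales_by_region_north
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
```

Each level has a single-column key here, which keeps the bounds easy to reason about at the cost of one extra level of tables.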

Here's how you can partition a table by multiple columns using range partitioning as an example:

  1. Create the parent table: First, create the parent table with a PARTITION BY clause that names the partitioning strategy and the key columns. Column order matters: range bounds are compared column by column from left to right, so the column that each partition should match exactly (here, region) goes first.
CREATE TABLE sales (
    sale_id   serial  NOT NULL,
    region    text    NOT NULL,
    sale_date date    NOT NULL,
    amount    numeric NOT NULL
) PARTITION BY RANGE (region, sale_date);

In this example, we're partitioning the sales table by range on region and sale_date.

  2. Create partition tables: Next, define the partitions. Each partition holds the rows whose partition key falls between its lower bound (inclusive) and its upper bound (exclusive).
CREATE TABLE sales_2020_north PARTITION OF sales
    FOR VALUES FROM ('north', '2020-01-01') TO ('north', '2021-01-01');
CREATE TABLE sales_2021_north PARTITION OF sales
    FOR VALUES FROM ('north', '2021-01-01') TO ('north', '2022-01-01');

These commands create two partitions: one for sales in the 'north' region during 2020, and another for sales in the same region during 2021. If sale_date were listed first instead, the region bound would only be checked for rows dated exactly on a boundary date, so a mid-2020 sale from any other region would also land in sales_2020_north. You would continue defining partitions for other regions and time frames as needed.
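Bounds on a multi-column range key are compared the way PostgreSQL compares row values: left to right, moving to the next column only when the earlier columns tie. The queries below are a small sketch of that rule using the bound values from this example. They run without any tables, and the column aliases are illustrative.

```sql
-- Date first: a 'south' row dated mid-2020 still falls between the
-- ('2020-01-01', 'north') and ('2021-01-01', 'north') bounds, because the
-- date column alone decides both comparisons. Both columns return true.
SELECT ROW(DATE '2020-06-15', 'south'::text) >= ROW(DATE '2020-01-01', 'north'::text) AS at_or_above_lower,
       ROW(DATE '2020-06-15', 'south'::text) <  ROW(DATE '2021-01-01', 'north'::text) AS below_upper;

-- Region first: the same row is compared on region before the date is
-- considered, so it no longer sits below the ('north', '2021-01-01') bound.
SELECT ROW('south'::text, DATE '2020-06-15') < ROW('north'::text, DATE '2021-01-01') AS below_upper;
```

Putting the equality-style column first and the true range column last is what keeps each partition limited to one region and one year.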

  3. Insert data into the parent table: When you insert data into the parent sales table, PostgreSQL automatically routes each row to the correct partition based on the partition bounds.
INSERT INTO sales (region, sale_date, amount)
VALUES ('north', '2020-06-15', 100);

This record would be stored in the sales_2020_north partition.
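One way to confirm where a row ended up is to select the tableoid system column through the parent table and cast it to regclass, which displays the name of the partition that physically stores each row. This is a quick check, assuming the sales table and partitions defined above; the partition_name alias is illustrative.

```sql
-- Show which partition physically stores each row of the parent table.
SELECT tableoid::regclass AS partition_name,
       region,
       sale_date,
       amount
FROM sales;
```

A row whose key matches no partition is rejected at insert time unless the table has a DEFAULT partition, so this query only ever lists partitions you have created.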

Partitioning by multiple columns can add complexity but offers flexibility for optimizing performance based on your specific query patterns. It's important to design your partitioning scheme thoughtfully, considering how your data is accessed and updated.
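To see whether a partitioning scheme matches a given query pattern, it can help to inspect the plan and see which partitions are scanned. The query below is a sketch against the sales table from this example. The exact plan text varies by PostgreSQL version and settings, so no expected output is shown.

```sql
-- Filters on the partition key columns give the planner a chance to skip
-- partitions that cannot contain matching rows (partition pruning).
EXPLAIN
SELECT sum(amount)
FROM sales
WHERE region = 'north'
  AND sale_date >= DATE '2020-01-01'
  AND sale_date <  DATE '2021-01-01';
```

If the plan still scans partitions that cannot hold matching rows, that is often a sign the query does not constrain the leading partition key columns.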
