
Parallel plans/queries in AWS Aurora Postgres

By parallel, I mean distributing the workload of a single (analytical) query to multiple threads or even replicas.

I see that OSS Postgres has supported them since 9.6: https://www.postgresql.org/docs/9.6/static/parallel-plans.html

AWS Aurora has added a preview of this, but only for the MySQL variant: https://aws.amazon.com/about-aws/whats-new/2018/02/amazon-aurora-parallel-query-is-available-for-preview/

But plain AWS RDS (not Aurora) does support them, by virtue of having Postgres 9.6: https://aws.amazon.com/blogs/database/performing-parallel-queries-and-phrase-searching-with-amazon-rds-for-postgresql-9-6-1/

  1. Am I correct in inferring that AWS Aurora Postgres does not support parallel plans?
  2. If so, does this mean that plain RDS Postgres may be more performant than Aurora Postgres for analytical queries?
  3. Is anything known about future support for query parallelism in Aurora Postgres?

Many thanks!

Some explanations:

"Parallel plans" in Postgres 9.6+ will do what you want: they speed up a single query by kicking off parallel worker processes.
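One way to see whether a given query actually got a parallel plan is to look at its EXPLAIN output for a Gather node and a "Workers Planned" line. Below is a minimal sketch in Python that inspects plan text; the helper names and the sample plan are hypothetical and hand-written for illustration, not output captured from a real server.

```python
import re

def uses_parallel_plan(explain_text: str) -> bool:
    """Return True if the plan text contains a Gather (or Gather Merge) node."""
    return re.search(r"\bGather( Merge)?\b", explain_text) is not None

def planned_workers(explain_text: str) -> int:
    """Sum all 'Workers Planned: N' values found in the plan text (0 if none)."""
    return sum(int(n) for n in re.findall(r"Workers Planned:\s*(\d+)", explain_text))

# Hand-written, simplified plan text in the shape Postgres prints for a
# parallel aggregate. The table name "orders" is an assumption.
SAMPLE_PLAN = """\
Finalize Aggregate
  ->  Gather
        Workers Planned: 2
        ->  Partial Aggregate
              ->  Parallel Seq Scan on orders
"""

if __name__ == "__main__":
    print(uses_parallel_plan(SAMPLE_PLAN))
    print(planned_workers(SAMPLE_PLAN))
```

In practice you would feed these helpers the text returned by running EXPLAIN on your own query; if no Gather node shows up, the planner chose a serial plan.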

Aurora "parallel query" for MySQL is something completely different, although it also results in what you want.

Aurora (for both MySQL and Postgres) has a special distributed storage layer that keeps 6 redundant copies of the data, which are used for read replicas and for failure recovery. "Parallel query" leverages this storage layer (storage nodes with their own CPUs) to perform part of the query computation, offloading the DB VM.

For example, filtering out rows and columns that are not relevant to the query can be pushed down to the storage layer, instead of reading all the data back to the VM and discarding it there.
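To make the pushdown idea concrete, here is a toy model in Python. It is only a sketch of the concept, not how Aurora is implemented: the "storage nodes" are plain lists, all function names are hypothetical, and "transferred" simply counts how many column values would cross from storage to the DB VM.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

def scan_without_pushdown(nodes: List[List[Row]],
                          predicate: Callable[[Row], bool],
                          columns: List[str]) -> Tuple[List[Row], int]:
    """Ship every full row to the VM, then filter and project there."""
    transferred = 0
    result = []
    for node in nodes:
        for row in node:
            transferred += len(row)  # all columns of every row are shipped
            if predicate(row):
                result.append({c: row[c] for c in columns})
    return result, transferred

def scan_with_pushdown(nodes: List[List[Row]],
                       predicate: Callable[[Row], bool],
                       columns: List[str]) -> Tuple[List[Row], int]:
    """Filter and project at the storage node; ship only what the query needs."""
    transferred = 0
    result = []
    for node in nodes:
        for row in node:
            if predicate(row):
                projected = {c: row[c] for c in columns}
                transferred += len(projected)  # only matching, projected values
                result.append(projected)
    return result, transferred

# Made-up data spread over two storage nodes.
NODES = [
    [{"id": 1, "region": "eu", "amount": 10}, {"id": 2, "region": "us", "amount": 20}],
    [{"id": 3, "region": "eu", "amount": 30}, {"id": 4, "region": "us", "amount": 40}],
]

def is_eu(row: Row) -> bool:
    return row["region"] == "eu"

if __name__ == "__main__":
    print(scan_without_pushdown(NODES, is_eu, ["amount"]))
    print(scan_with_pushdown(NODES, is_eu, ["amount"]))
```

Both scans return the same rows; the difference is only in how much data has to travel to the VM, which is where the benefit for selective analytical queries would come from.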

So in summary:

"Parallel plans" in Postgres 9.6+ create parallel execution processes inside the VM.

"Parallel query" in Aurora pushes computation down into the storage layer, offloading the VM.

Answers to your Qs:

  1. Correct
  2. Maybe: see above. It depends on the benefit you get from pushing data filtering down to the storage layer.
  3. According to the AWS blog, "parallel query" for Postgres is under development: "We are launching with support for MySQL 5.6, and are working on support for MySQL 5.7 and PostgreSQL." https://aws.amazon.com/blogs/aws/new-parallel-query-for-amazon-aurora/

I think a significant boost to Postgres analytics performance is necessary to support "real-time operational analytics" on a transactional Postgres system. "Parallel plans" in 9.6+ are a start. Aurora "parallel query" is another, different approach. There might be other approaches to speeding up analytics on Postgres. I'd like to see such solutions on other clouds beyond AWS, such as Azure and GCP.

I have submitted a request for Azure - please upvote there if you agree:

https://feedback.azure.com/forums/597976-azure-database-for-postgresql/suggestions/35794984-transactional-db-with-analytics


 