
Redshift Spectrum - Referencing an external table in a CTE?

I'm trying to make some data available via Redshift Spectrum to our reporting platform. I chose Spectrum because it offers lower latency to our data lake vs a batched ETL process.

One of the queries I have looks like this:

with txns as (select * from spectrum_table where ...)
select field1, field2, ...
from txns t1
left join txns t2 on t2.id = t1.id
left join txns t3 on t3.id = t1.id
where...

Intuitively, the CTE should cache the Spectrum query output in memory and make it available later in the query without hitting S3 a second (or third) time.

However, I checked the explain plan, and with each join the number of "S3 Seq Scan"s goes up by one. So it appears to do the Spectrum scan each time the CTE is queried.
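
For reference, the check described above can be reproduced roughly as follows. This is only a sketch: the schema name `spectrum_schema`, the column `txn_date`, and the filter value are placeholders that do not come from the real query. Running it and counting the "S3 Seq Scan" nodes in the output shows how many times the external table is planned to be scanned.

```sql
-- Hypothetical names: spectrum_schema, txn_date, and the date literal are
-- placeholders, not taken from the real query.
EXPLAIN
WITH txns AS (
    SELECT id, field1, field2
    FROM spectrum_schema.spectrum_table
    WHERE txn_date >= '2024-01-01'
)
SELECT t1.field1,
       t2.field2 AS field2_t2,
       t3.field2 AS field2_t3
FROM txns t1
LEFT JOIN txns t2 ON t2.id = t1.id
LEFT JOIN txns t3 ON t3.id = t1.id;
```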

Questions:

  1. Is this actually happening? Or is the explain plan wrong? The run-time of this query doesn't appear to increase linearly with the number of joins, so it's hard to tell.

  2. If it is happening, what other options are there to achieve this sort of result, other than manually creating a temp table? (This will be accessed by a reporting tool, so I'd prefer to avoid allowing explicit write access or requiring multiple statements to get the data.)

Thanks!

  1. Yes, this is really happening. CTE references are not reused, because each reference may end up reading different data. Applying where clauses at the table scan is an important performance feature, so each reference gets its own scan with its own filters.

  2. You could look into using a materialized view, but I expect that you are dynamically applying the where clauses in the CTE, so this may not match your needs. If it were me, I'd want to understand why the triple self-join is needed. It seems like there may be a better way to construct the query, but that is just a gut feeling.
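
As a rough illustration of the materialized view option, here is a minimal sketch. It assumes the filters can be split into a fixed part (baked into the view) and a dynamic part (applied when the reporting tool queries the view). The names `txns_mv`, `spectrum_schema`, `txn_date`, and the date literals are all hypothetical, and the refresh would have to be run or scheduled outside the reporting tool.

```sql
-- Hypothetical names throughout: txns_mv, spectrum_schema, txn_date.
-- The view stores the Spectrum result inside Redshift, so later queries
-- read the stored rows instead of scanning S3 again.
CREATE MATERIALIZED VIEW txns_mv AS
SELECT id, field1, field2, txn_date
FROM spectrum_schema.spectrum_table
WHERE txn_date >= '2024-01-01';   -- fixed, coarse filter (an assumption)

-- Run by an admin or a scheduled job, not by the reporting tool.
REFRESH MATERIALIZED VIEW txns_mv;

-- The reporting tool then issues a single read-only statement and applies
-- its dynamic filters on top of the stored rows.
SELECT t1.field1,
       t2.field2
FROM txns_mv t1
LEFT JOIN txns_mv t2 ON t2.id = t1.id
WHERE t1.txn_date >= '2024-06-01';
```

The trade-off is freshness: the view only reflects the data as of its last refresh, which works against the low-latency reason for choosing Spectrum in the first place.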

