
Long runtime when a query is executed for the first time in Redshift

I noticed that the first time I run a query on Redshift, it takes 3-10 seconds. When I run the same query again, even with different arguments in the WHERE condition, it runs fast (0.2 seconds). The query in question runs on a table of ~1M rows and touches 3 integer columns.

Is this huge difference in execution time caused by Redshift compiling the query the first time it is run and then reusing the compiled code?

If so, how can I keep this cache of compiled queries warm at all times?

One more question: given queryA and queryB, assume queryA was compiled and executed first. How similar must queryB be to queryA for the execution of queryB to reuse the code compiled for queryA?

The answer to the first question is yes. Amazon Redshift compiles code for each query and caches it. The compiled code is shared across sessions on the same cluster, so the same query, often even with different parameters and in a different session, runs faster because it skips the compilation overhead.
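One way to check whether a particular run actually reused compiled code is to look at the `svl_compile` system view, which records for each query segment whether it was compiled (`compile = 1`) or served from the cache (`compile = 0`). Below is a minimal Python sketch of that check. The helper name `summarize_compile`, the `%s` placeholder style (as used by psycopg2-like drivers), and the way rows are fetched are assumptions for illustration, not part of the original answer.

```python
from typing import Dict, Iterable, Tuple

# Query to run against the cluster for a given query ID.
# The %s placeholder assumes a psycopg2-style driver (an assumption).
COMPILE_SQL = (
    "SELECT segment, compile FROM svl_compile "
    "WHERE query = %s ORDER BY segment"
)


def summarize_compile(rows: Iterable[Tuple[int, int]]) -> Dict[str, int]:
    """Count compiled vs. reused segments.

    `rows` are (segment, compile) pairs as returned by COMPILE_SQL,
    where compile is 1 if the segment was compiled for this run
    and 0 if cached code was reused.
    """
    rows = list(rows)
    compiled = sum(1 for _, flag in rows if flag == 1)
    reused = sum(1 for _, flag in rows if flag == 0)
    return {"compiled": compiled, "reused": reused}
```

If a repeated run reports zero compiled segments, the compiled code from the earlier run was reused in full.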

The documentation also recommends using the time of the second execution of a query when benchmarking, because the first execution includes the compilation overhead.
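The benchmarking advice can be sketched as a small timing harness: run the query several times, then report the second run rather than the first. This is only an illustration. `run_query` is a hypothetical callable that you would implement to execute your query through whatever driver you use; nothing here is a Redshift API.

```python
import time
from typing import Callable, List


def time_runs(run_query: Callable[[], object], runs: int = 2) -> List[float]:
    """Call run_query `runs` times and return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def warm_timing(timings: List[float]) -> float:
    """Return the second run's time, skipping the cold first run
    (which may include query compilation)."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to skip the cold run")
    return timings[1]
```

For example, `warm_timing(time_runs(my_query))` gives the figure to compare between query variants, where `my_query` is your own function.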

The details are in the following documentation page: http://docs.aws.amazon.com/redshift/latest/dg/c-compiled-code.html
