BigQuery execution time and scaling

I created a test dataset of roughly 450 GB in BigQuery, and a query against the largest table (10bn rows) takes ~9 seconds when run from the web UI. I just wanted to check whether this is a 'normal' expected result, and whether it would get worse with a larger size (i.e. 100bn+ rows) or with more complex queries. I am aware of table partitioning etc., but I want to get a sense of the 'normal' expected speed before getting into optimization, since the above seems like a 'smallish' size for what BQ is meant for.

The above result is achieved on a simple query like this:

select ColumnA from DataSet.Table order by ColumnB desc limit 100

So the result returned to the client is very small. ColumnA holds UUIDs stored as strings, and ColumnB is an integer.

It's almost impossible to say whether this is "normal" or not. BigQuery is a multi-tenant architecture, which means we all share the same resources (i.e. compute power) in the cluster when running queries. Query times are therefore never deterministic in BigQuery: they vary with the number of concurrent queries other users are running at any given time. That said, you can get reserved slots at a flat-rate price, although you'd need to be spending quite a lot of money to justify that.

You can improve execution times by removing compute-, shuffle-, and memory-intensive steps such as order by. Obviously, the complexity of the query will also have an impact on query times.
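To see why an order by over the whole table is expensive even with limit 100, here is a small Python sketch of a generic two-stage top-N over sharded data. It is only an illustration of the general technique, not a description of how BigQuery executes the query internally; the shard layout, the toy data, and the names `top_n_per_shard` and `top_n_distributed` are all made up for the example.

```python
import heapq
import random


def top_n_per_shard(rows, n):
    """Return the n rows with the largest ColumnB value within one shard.

    Each row is a (column_a, column_b) tuple. Every row in the shard is
    still read once, even though only n rows are kept.
    """
    return heapq.nlargest(n, rows, key=lambda row: row[1])


def top_n_distributed(shards, n):
    """Two-stage top-N: local top-N per shard, then a merge of the candidates."""
    # Stage 1: each shard scans all of its rows and keeps at most n of them.
    partials = [top_n_per_shard(shard, n) for shard in shards]
    # Stage 2: only len(shards) * n candidate rows have to be brought together.
    candidates = [row for partial in partials for row in partial]
    return heapq.nlargest(n, candidates, key=lambda row: row[1])


# Toy stand-in for the table: 8 hypothetical shards of 10,000 rows each.
random.seed(0)
shards = [
    [(f"id-{s}-{i}", random.randrange(10**9)) for i in range(10_000)]
    for s in range(8)
]

top = top_n_distributed(shards, 100)
column_a = [row[0] for row in top]  # what "select ColumnA" would return
```

The point of the sketch is that the small result set does not make the work small: every ColumnB value has to be examined before the top 100 are known, so the cost grows with the table, not with the limit.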

On some of our projects we can smash through 3TB-5TB with a relatively complex query in about 15s-20s. Sometimes it's quicker, sometimes it's slower. We also run queries over much smaller datasets that can take the same amount of time. This comes back to what I wrote at the beginning: BigQuery query times are not deterministic.

Finally, BigQuery caches results, so if you issue the same query multiple times over the same unchanged data, it will be served from the cache, i.e. much quicker!
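Because cached results come back much faster, a timing test should make sure the cache is not involved. The following is a minimal sketch, assuming the `google-cloud-bigquery` Python client is installed and default credentials are configured; `summarize_job` and `run_uncached` are hypothetical helper names, and the table name is the one from the question.

```python
def summarize_job(job):
    """Collect the timing-related fields of a finished query job.

    Works with any object exposing the same attributes as the client
    library's QueryJob: started, ended, cache_hit, total_bytes_processed
    and slot_millis.
    """
    return {
        "cache_hit": bool(job.cache_hit),
        "elapsed_s": (job.ended - job.started).total_seconds(),
        "bytes_processed": job.total_bytes_processed,
        "slot_ms": job.slot_millis,
    }


def run_uncached(sql):
    """Run sql with the result cache disabled and return its job summary."""
    # Imported here so summarize_job can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default project and credentials
    config = bigquery.QueryJobConfig(use_query_cache=False)
    job = client.query(sql, job_config=config)
    job.result()  # block until the query has finished
    return summarize_job(job)


# Example call (needs real credentials, so it is left commented out):
# print(run_uncached(
#     "select ColumnA from DataSet.Table order by ColumnB desc limit 100"
# ))
```

Running this several times at different times of day and comparing `elapsed_s` gives a more honest picture than a single run, since the numbers will still vary with whatever else is executing in the shared cluster.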
