
How to improve frequent BigQuery reads?

I'm using the BigQuery client for Java to do small reads on a table with about 5 GB of data. The queries I run are standard SQL of the form SELECT foo FROM my-table WHERE bar=$1, where the result is at most one row. I need to do this at a high frequency, so performance is a big concern. How do I optimize for this?

I thought about pulling the entire data set periodically, since it's only 5 GB, but then again 5 GB sounds like a lot to constantly keep in memory.
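For reference, here is a minimal sketch of what that periodic pull could look like with the google-cloud-bigquery Java client. The my_dataset.my_table name is a placeholder, and the bar/foo column names are taken from the query above; keying the map on bar alone may keep the footprint well below the raw 5 GB if the other columns are wide.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
import java.util.HashMap;
import java.util.Map;

public class TableCache {
    private final BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    // Replaced wholesale on each refresh; volatile so readers always see a complete map.
    private volatile Map<String, String> byBar = new HashMap<>();

    // Call from a scheduled task (e.g. once a day, matching the write frequency).
    public void refresh() throws InterruptedException {
        QueryJobConfiguration config = QueryJobConfiguration
                .newBuilder("SELECT bar, foo FROM `my_dataset.my_table`")
                .build();
        Map<String, String> fresh = new HashMap<>();
        for (FieldValueList row : bigquery.query(config).iterateAll()) {
            fresh.put(row.get("bar").getStringValue(), row.get("foo").getStringValue());
        }
        byBar = fresh; // atomic reference swap, no locking needed for readers
    }

    // The hot path: an in-memory lookup instead of a BigQuery round trip.
    public String lookup(String bar) {
        return byBar.get(bar);
    }
}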

Running this query in the BigQuery console shows something like Query complete (0.6 sec elapsed, 4.2 GB processed). That is fast for 4.2 GB, but not fast enough. Again, I need to read from the table very frequently but write to it rarely (maybe once a day or week).

Maybe I could tell the server to cache the processed data somehow?

You don't have control over the cache layer in BigQuery; that is something the service manages for you automatically. Unfortunately, the typical cache lifetime is 24 hours, and the cached results are best-effort and may be invalidated sooner (official docs).
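If you want to confirm whether a given query was served from that cache, the Java client exposes the flag on the job statistics. A minimal sketch, reusing the table and query shape from the question (setUseQueryCache(true) is already the default and is shown only for clarity):

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.JobStatistics.QueryStatistics;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class CacheCheck {
    public static void main(String[] args) throws InterruptedException {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        QueryJobConfiguration config = QueryJobConfiguration
                .newBuilder("SELECT foo FROM `my_dataset.my_table` WHERE bar = 'x'")
                .setUseQueryCache(true) // allow cached results (the default)
                .build();
        Job job = bigquery.create(JobInfo.of(config)).waitFor();
        QueryStatistics stats = job.getStatistics();
        // true when BigQuery served the result from its ~24h result cache
        System.out.println("cache hit: " + stats.getCacheHit());
    }
}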

A query completing in 0.6 s seems to be good for BigQuery. I'm afraid that if you are looking for something faster, BigQuery may not be the right data warehouse for your use case.

BigQuery is built for analytical processing, not for interacting with individual rows. The best practice would be, as you mentioned, to hold a copy of the data in a place that allows quicker and more efficient reads of individual rows (like a MySQL database).
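As a sketch of what the serving side could look like against such a MySQL replica (the JDBC URL, credentials, and table name are all hypothetical, and the replica would be refreshed by your daily or weekly write job), a point lookup with an index on bar takes milliseconds instead of a full scan:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RowLookup {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/mydb", "user", "pass");
             PreparedStatement stmt = conn.prepareStatement(
                     // assumes an index on bar, e.g. CREATE INDEX idx_bar ON my_table (bar)
                     "SELECT foo FROM my_table WHERE bar = ?")) {
            stmt.setString(1, "some-value");
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    System.out.println(rs.getString("foo"));
                }
            }
        }
    }
}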

However, you can still drastically reduce the amount of data scanned by your query by clustering the table on the field you're filtering on:

https://cloud.google.com/bigquery/docs/creating-clustered-tables
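For example, a minimal sketch that rebuilds the table clustered on bar using DDL from the Java client (dataset and table names are placeholders). Once the data is physically ordered by bar, the point lookup from the question scans only the matching blocks instead of the full 4.2 GB:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class ClusterTable {
    public static void main(String[] args) throws InterruptedException {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        // Rebuild the table clustered on the filter column.
        String ddl = "CREATE TABLE `my_dataset.my_table_clustered` "
                + "CLUSTER BY bar "
                + "AS SELECT * FROM `my_dataset.my_table`";
        bigquery.query(QueryJobConfiguration.newBuilder(ddl).build());
    }
}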
