How to optimize BigQuery queries on a date field where most queries are at a year-month level

A question about BigQuery query performance on date fields...

I have a very large data table where each record has an 'event date' field. Most of the queries on the table are actually run at a calendar-month level, e.g. January 2020. Is there any BigQuery performance gain to be had from adding extra field(s) that store either 'year-month' as one field, or 'year' and 'month' as two separate fields?

Have you already partitioned your table by month? If not, doing so will let queries scan much less data (only the partitions for the specified month). The partition-by-month feature went to GA just weeks ago:

September 21, 2020

The following time-unit partitioning features are now Generally Available (GA):

Creating partitions using hourly, monthly, and yearly time-unit granularities.

https://cloud.google.com/bigquery/docs/creating-column-partitions#daily_partitioning_vs_hourly_partitioning
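As a rough sketch of what that could look like, the Python helpers below just build the SQL strings: one statement that copies an existing table into a month-partitioned one, and one query that filters a single calendar month with a half-open date range on the partitioning column so that only that month's partition should be scanned. This assumes `event_date` is a DATE column; the table names `my_dataset.events` and `my_dataset.events_by_month` and the helper names are hypothetical, not taken from the question.

```python
from datetime import date
from typing import Tuple


def month_bounds(year: int, month: int) -> Tuple[date, date]:
    """Return (first day of the month, first day of the following month)."""
    start = date(year, month, 1)
    # After December, roll over to January of the next year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def monthly_partition_ddl(source: str, target: str, column: str = "event_date") -> str:
    """Build a CREATE TABLE ... AS SELECT that partitions by calendar month.

    Assumes `column` is a DATE column (hypothetical schema).
    """
    return (
        f"CREATE TABLE `{target}`\n"
        f"PARTITION BY DATE_TRUNC({column}, MONTH)\n"
        f"AS SELECT * FROM `{source}`"
    )


def month_query(table: str, year: int, month: int, column: str = "event_date") -> str:
    """Build a query restricted to one calendar month.

    The filter is a constant half-open range on the partitioning column.
    """
    start, end = month_bounds(year, month)
    return (
        f"SELECT COUNT(*) AS events\n"
        f"FROM `{table}`\n"
        f"WHERE {column} >= DATE '{start.isoformat()}'\n"
        f"  AND {column} < DATE '{end.isoformat()}'"
    )


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(monthly_partition_ddl("my_dataset.events", "my_dataset.events_by_month"))
    print()
    print(month_query("my_dataset.events_by_month", 2020, 1))
```

Filtering directly on `event_date` rather than on a derived year-month expression keeps the predicate on the partitioning column itself, which is what partition pruning relies on.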
