
Saving partitioned table with BigQuery Spark connector

I wanted to create a table from PySpark with the two options below (partition by and require_partition_filter), but I can't see a way to do this with the BigQuery connector.

This is how I would do it in BigQuery:

    CREATE TABLE dataset.table
    PARTITION BY DATE_TRUNC(collection_date, DAY)
    OPTIONS (require_partition_filter = TRUE)
    AS SELECT XXXX

This is what I normally do:

    (
        dataframe
        .write
        .format("bigquery")
        .mode(mode)
        .save(f"{dataset}.{table_name}")
    )

You can use the partitionField, datePartition, and partitionType write options.

For clustering, use clusteredFields. A sketch of how these might be combined is shown below.
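
A minimal sketch, assuming a DataFrame with a DATE or TIMESTAMP column named collection_date (the column name, dataset, and table name are placeholders from the question):

    (
        dataframe
        .write
        .format("bigquery")
        .mode(mode)
        # partition the target table by day on collection_date
        .option("partitionField", "collection_date")
        .option("partitionType", "DAY")
        # comma-separated list of columns to cluster by
        .option("clusteredFields", "collection_date")
        .save(f"{dataset}.{table_name}")
    )

The connector's documented write options do not appear to cover require_partition_filter directly; one workaround is to set it afterwards in BigQuery with ALTER TABLE dataset.table SET OPTIONS (require_partition_filter = TRUE).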

See more options here:

https://github.com/GoogleCloudDataproc/spark-bigquery-connector#properties
