I wanted to create a table from PySpark with the two options below (partition by and require partition filter), but I can't see a way to do this with the BigQuery connector.
This is how I would do it in BigQuery
CREATE TABLE dataset.table
PARTITION BY DATE_TRUNC(collection_date, DAY)
OPTIONS (require_partition_filter = TRUE)
AS SELECT XXXX
This is what I normally do
dataframe
.write
.format("bigquery")
.mode(mode)
.save(f"{dataset}.{table_name}")
You can use partitionField, datePartition, and partitionType.
For clustering, use clusteredFields.
See more options:
https://github.com/GoogleCloudDataproc/spark-bigquery-connector#properties
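Putting those options together, a minimal sketch of the write might look like the following. The option names (partitionField, partitionType, clusteredFields) come from the connector's properties page linked above; the dataset, table, and column names are placeholders from the question. The actual .save() call is commented out since it needs a live Spark session and BigQuery credentials.

```python
def bigquery_write_options(partition_field, partition_type="DAY", clustered_fields=None):
    """Build the option map for a partitioned (and optionally clustered) write."""
    options = {
        "partitionField": partition_field,   # column to partition on
        "partitionType": partition_type,     # DAY, HOUR, MONTH, or YEAR
    }
    if clustered_fields:
        # clusteredFields takes a comma-separated list of column names
        options["clusteredFields"] = ",".join(clustered_fields)
    return options

options = bigquery_write_options("collection_date", clustered_fields=["country"])

# dataframe \
#     .write \
#     .format("bigquery") \
#     .options(**options) \
#     .mode("overwrite") \
#     .save("dataset.table")
```

As far as I can tell the connector does not expose require_partition_filter directly, so one workaround is to set it after the write with BigQuery DDL: ALTER TABLE dataset.table SET OPTIONS (require_partition_filter = TRUE).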