
Google BigQuery WRITE_TRUNCATE erasing all data

I have a table set up in BigQuery where, if I write data for a date partition that already exists, I want that partition to be overwritten. I've set up the job_config to use WRITE_TRUNCATE.

from google.cloud import bigquery

client = bigquery.Client()

# file_obj: a file-like object containing newline-delimited JSON

# Load jobs are configured with LoadJobConfig (QueryJobConfig is for query jobs)
job_config = bigquery.LoadJobConfig()

# Destination table; it is passed to load_table_from_file below
dest_dataset = 'test'
dest_table_name = 'sales_data'
destination_dataset = client.dataset(dest_dataset)
destination_table = destination_dataset.table(dest_table_name)

# Set configuration.load.writeDisposition and sourceFormat
job_config.write_disposition = 'WRITE_TRUNCATE'
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON

# Set partitioning
time_partitioning = bigquery.table.TimePartitioning(
    bigquery.table.TimePartitioningType.DAY, 'date'
)
job_config.time_partitioning = time_partitioning

# Start the load job
job = client.load_table_from_file(
        file_obj, destination_table,
        job_config=job_config
)
# Wait for the job to finish
job.result()

However, I noticed that when I backfill data, the load always overwrites all data in the table, even if the date partition is different. For example, if the table holds data for 20190101-20190201 and I load data for 20190202 to the present, the whole table is erased and only the new data remains. Shouldn't the existing data be preserved, since it sits in different date partitions? Any idea why this is happening or if I'm missing something?

Any idea why this is happening or if I'm missing something?

job_config.write_disposition = 'WRITE_TRUNCATE' acts on the whole table. Its documented behavior is: if the table already exists, BigQuery overwrites the table data. It does not take partitioning into account, so when the destination is the table itself, every partition is replaced.

If you need to overwrite a specific partition, you have to reference that partition explicitly as the destination, using a partition decorator, for example sales_data$20190202.
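A minimal sketch of what that could look like with the Python client, assuming the table already exists and is partitioned by day on the date column as in the question. The helper names partition_table_id and overwrite_partition are hypothetical, and the file is assumed to contain rows for that one date only.

```python
from datetime import date


def partition_table_id(table_id: str, day: date) -> str:
    """Append a $YYYYMMDD partition decorator to a table ID."""
    return f"{table_id}${day:%Y%m%d}"


def overwrite_partition(client, file_obj, table_id: str, day: date):
    """Replace one daily partition with the rows in file_obj.

    Assumes `client` is a google.cloud.bigquery.Client and `file_obj`
    holds newline-delimited JSON whose rows all belong to `day`.
    """
    # Imported here so partition_table_id can be used without the library.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        # With a decorated destination, the truncate targets that partition
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
        # Assumed to match the existing table's partitioning
        time_partitioning=bigquery.TimePartitioning(
            type_=bigquery.TimePartitioningType.DAY, field="date"
        ),
    )
    job = client.load_table_from_file(
        file_obj,
        partition_table_id(table_id, day),  # e.g. test.sales_data$20190202
        job_config=job_config,
    )
    return job.result()  # Wait for the load job to finish
```

For a backfill that spans several dates, one option is to split the input by date and call overwrite_partition once per day, so each load touches only its own partition.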
