I have the EXPORT DATA statement below, which runs successfully but generates 22 files of 0 bytes. The SQL is correct and no data should be returned; that's not my problem. The issue is why the export still produces 22 files in GCS. My expectation is that if the query returns no rows, no files should be created.
How do I stop that? Thank you.
EXPORT DATA OPTIONS (
uri = 'gs://<<BUCKET>>/<<TABLE>>*.csv',
format = 'CSV',
overwrite = true,
header = false,
field_delimiter = '|'
) AS
SELECT DISTINCT *
FROM `<<PROJECT>>.<<DATASET>>.VWE_<<TABLE>>`
WHERE CAST(LASTLOADDATE AS DATETIME) > DATETIME_SUB(CURRENT_DATE, INTERVAL 2 DAY)
  AND LASTLOADDATE IS NOT NULL;
Unfortunately, this is normal behaviour for a BigQuery export that uses a wildcard in the uri. BigQuery shards the exported data into multiple files based on the provided pattern, and the size of the exported files will vary (see the BigQuery export documentation).
Even if the query returns no rows, with a wildcard BigQuery can still generate multiple empty files.
If you need to get rid of the empty files, you can write a dedicated shell script to remove them, for example:
# check file size with
gsutil du -s -a gs://bucket/kitten.png
# remove files with
gsutil rm gs://bucket/kitten.png
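The two commands above work on a single object. A minimal sketch of how they might be combined into a script that removes only the zero-byte files of an export is shown below. It assumes that `gsutil ls -l` prints one object per line as size, timestamp, and URL, followed by a `TOTAL:` summary line, and that the exported object names contain no spaces. The function names `empty_object_urls` and `remove_empty_exports`, and the bucket and prefix in the usage line, are hypothetical placeholders.

```shell
#!/usr/bin/env bash

# Read `gsutil ls -l` output on stdin and print the URLs of zero-byte objects.
# Object lines are assumed to look like: "<size>  <timestamp>  <gs://url>".
# The trailing "TOTAL: ..." line is skipped because its first field is not "0".
empty_object_urls() {
  awk '$1 == "0" && $3 ~ /^gs:\/\// { print $3 }'
}

# Delete the zero-byte objects matching a wildcard URI (hypothetical helper).
remove_empty_exports() {
  local pattern="$1" urls
  urls="$(gsutil ls -l "$pattern" | empty_object_urls)"
  # Nothing to delete: return without calling gsutil rm on an empty list.
  [ -z "$urls" ] && return 0
  # -I makes gsutil rm read the list of objects to remove from stdin.
  printf '%s\n' "$urls" | gsutil -m rm -I
}

# Usage (placeholder bucket and prefix; adjust to match the export uri):
# remove_empty_exports 'gs://my-bucket/my_table*.csv'
```

Run it right after the EXPORT DATA job finishes, with the same wildcard pattern as the export uri, so that only files from that export are considered.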