
BigQuery load job said successful but data did not get loaded into table

I submitted a BigQuery load job; it ran and returned with the status successful. But the data didn't make it into the destination table.

Here is the command that was run:

/usr/local/bin/bq load --nosynchronous_mode --project_id=ardent-course-601 --job_id=logsToBq_load_impressions_20140816_1674a956_6c39_4859_bc45_eb09db7ef99a --source_format=NEWLINE_DELIMITED_JSON dw_logs_impressions.impressions_20140816 gs://sm-uk-hadoop/queries/logsToBq_transformLogs/impressions/20140816/9307f6e3-0b3a-44ca-8571-7107c399998c/part* /opt/sm-analytics/projects/logsTobqMR/jsonschema/impressionsSchema.txt

I checked the status of the job logsToBq_load_impressions_20140816_1674a956_6c39_4859_bc45_eb09db7ef99a. Its statistics showed the correct number of input files and the correct total input size.
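For reference, here is how the job status can be inspected from the command line; a minimal sketch, where statistics.load.outputRows in the JSON output is the row count BigQuery reports having written, which can be compared against the destination table's actual row count:

# Dump the job's full status and load statistics as JSON
bq show --format=prettyjson -j logsToBq_load_impressions_20140816_1674a956_6c39_4859_bc45_eb09db7ef99a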

Does anyone know why the data didn't make it into the table even though the job was reported as successful?

Just in case this was not a mistake on our side, I ran the load job again, but to a different destination table, and this time the data made it into the destination table fine.

Thank you.

This is very surprising, but I've confirmed via the logs that this is indeed the case.

Unfortunately, the detailed logs for this job, which ran on August 16, are no longer available. We're investigating whether this may have affected other jobs more recently. Please ping this thread if you see this issue again.

I experienced this recently with BigQuery in sandbox mode without a billing account. In this mode the partition expiration is automatically set to 60 days. If you load data into a table where the partitioning column (e.g. date) is older than 60 days, it won't show up in the table. The load job still succeeds with the correct number of output rows.
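A quick way to check whether a partition expiration is set on your table is bq show; a minimal sketch, assuming a hypothetical table mydataset.mytable (look for timePartitioning.expirationMs in the output):

# Print the table metadata as JSON, including any partition expiration
bq show --format=prettyjson mydataset.mytable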

We had this issue in our system too, and the reason was that the table was set with a 30-day partition expiry and was partitioned on a timestamp column. When someone ingested data older than the partition expiry window, the BigQuery load jobs completed successfully in Spark, but we saw no data in the ingestion tables: it was being deleted the moment it was ingested, because of the partition expiry.

Please check your BigQuery table's partition expiry settings against the partition column values of the incoming data. If a value falls outside the partition expiry window, you won't see that data in the BigQuery table; it will be deleted right after ingestion.
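If the expiry turns out to be the culprit, it can be lengthened with bq update; a sketch, assuming the same hypothetical mydataset.mytable (the flag takes seconds, and per the BigQuery docs a value of 0 should remove the expiration entirely, though I haven't verified that in sandbox mode):

# Raise the partition expiration to 90 days (7776000 seconds)
bq update --time_partitioning_expiration 7776000 mydataset.mytable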
