BigQuery bq load Internal Error
Background
I am trying to load a JSON file, x.json, using the bq CLI.
cat x.json
{"name":"xyz","mobile":"xxx","location":"abc"}
{"name":"xyz","mobile":"xxx","age":"22"}
Command Used
bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON project:test_datasets.cust x.json
'cust' is a table with an empty schema.
I am using '--autodetect' so that BigQuery autodetects the schema.
Output
Upload complete.
Waiting on bqjob_r475558282b85c552_000001569cf1efd8_1 ... (1s) Current status: DONE
BigQuery error in load operation: Error processing job 'project:bqjob_r475558282b85c552_000001569cf1efd8_1': An internal error occurred and the request could not be completed.
Any thoughts on why the internal error occurs and how to resolve it?
We have seen several problems:
For all of these we opened cases with paid Google Enterprise Support, but unfortunately they did not resolve them. It seems the recommended option is exponential backoff with retry; even support told us to do so. Also, the failure rate fits the 99.9% uptime we have in the SLA, so there is no reason for objection.
There's something to keep in mind regarding the SLA: it is a very strictly defined structure, the details are here. The 99.9% figure is uptime, and it does not translate directly into a failure rate. This means that if BQ has 30 minutes of downtime in one month, and you do 10,000 inserts within that period but no inserts at other times of the month, the numbers will be skewed. This is why we suggest an exponential backoff algorithm. The SLA is explicitly based on uptime and not error rate, but logically the two correlate closely if you do streaming inserts throughout the month at different times with a backoff-retry setup. Technically, you should experience on average about 1/1000 failed inserts if you insert throughout the month and have set up a proper retry mechanism.
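The backoff-and-retry advice above could be sketched as a small shell wrapper around any command, including bq load. This is only a minimal sketch, not an official Google pattern: the function name retry_with_backoff and the MAX_ATTEMPTS and BASE_DELAY variables are hypothetical names chosen for illustration, and the wrapper retries on any non-zero exit status rather than only on internal errors.

```shell
# Retry a command, doubling the wait between attempts.
# MAX_ATTEMPTS and BASE_DELAY are hypothetical tunables, not bq settings.
retry_with_backoff() {
  local max_attempts="${MAX_ATTEMPTS:-5}"   # total tries before giving up
  local delay="${BASE_DELAY:-1}"            # seconds to wait after the first failure
  local attempt=1
  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))        # exponential growth: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}
```

With the function defined in the current shell, the load from the question could be wrapped as: retry_with_backoff bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON project:test_datasets.cust x.json. With the defaults assumed here, the waits between the five attempts would be 1, 2, 4 and 8 seconds.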
You can check out this chart about your project health: https://console.developers.google.com/project/YOUR-APP-ID/apiui/apiview/bigquery?tabId=usage&duration=P1D
Try with this:
bq \
--project_id your_project_id \
--location=US \
load \
--autodetect \
--source_format=NEWLINE_DELIMITED_JSON \
'name_of_your_table' \
x.json