
BigQuery: Load Data into EU Dataset from GCS

In the past I have successfully loaded data into US-hosted BigQuery datasets from CSV data in US-hosted GCS buckets. We have since decided to move our BigQuery data to the EU, and I created a new dataset with that region selected. I have successfully populated those of our tables small enough to be uploaded from my machine at home, but two tables are far too large for this, so I would like to load them from files in GCS. I have tried doing this from both a US-hosted GCS bucket and an EU-hosted GCS bucket (thinking that bq load might not like to cross regions), but the load fails every time. Below is the error detail I'm getting from the bq command line (500, Internal Error). Does anyone know why this might be happening?

{
  "configuration": {
    "load": {
      "destinationTable": {
        "datasetId": "######", 
        "projectId": "######", 
        "tableId": "test"
      }, 
      "schema": {
        "fields": [
          {
            "name": "test_col", 
            "type": "INTEGER"
          }
        ]
      }, 
      "sourceFormat": "CSV", 
      "sourceUris": [
        "gs://######/test.csv"
      ]
    }
  }, 
  "etag": "######", 
  "id": "######", 
  "jobReference": {
    "jobId": "######", 
    "projectId": "######"
  }, 
  "kind": "bigquery#job", 
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/######", 
  "statistics": {
    "creationTime": "1445336673213", 
    "endTime": "1445336674738", 
    "startTime": "1445336674738"
  }, 
  "status": {
    "errorResult": {
      "message": "An internal error occurred and the request could not be completed.", 
      "reason": "internalError"
    }, 
    "errors": [
      {
        "message": "An internal error occurred and the request could not be completed.", 
        "reason": "internalError"
      }
    ], 
    "state": "DONE"
  }, 
  "user_email": "######"
}
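For reference, the load is issued roughly as follows (the dataset, table, and bucket names here are placeholders, not the redacted values from the job above):

# Sketch of the failing load; names are placeholders
bq load --source_format=CSV my_eu_dataset.test gs://my-bucket/test.csv test_col:INTEGER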

After searching through other related questions on StackOverflow I eventually realised that I had set my GCS bucket region to EUROPE-WEST-1 rather than the multi-region EU location. Things are now working as expected.
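In other words, the source bucket's location has to be compatible with the dataset's location; for an EU dataset that means the multi-region EU bucket location rather than a single region such as europe-west1. A minimal sketch of the fix with gsutil and bq (bucket, dataset, and table names are placeholders):

# Create a bucket in the EU multi-region, not a single EU region
gsutil mb -l EU gs://my-eu-bucket
# Copy the CSV into the EU multi-region bucket
gsutil cp test.csv gs://my-eu-bucket/test.csv
# Load into a table in the EU-located dataset
bq load --source_format=CSV my_eu_dataset.test gs://my-eu-bucket/test.csv test_col:INTEGER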
