BigQuery: adding a column to a table schema

Question

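I am trying to add a new column to an existing BigQuery table. I have tried both the bq command-line tool and the API. Calling Tables.update() produces the error below.

I also tried supplying the full schema with the additional field, and that gave me the same error.

With the API, the request body and the error response are: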

{
    "schema": {
        "fields": [{
            "name": "added_column",
            "type": "integer",
            "mode": "nullable"
        }]
    }
}



{
    "error": {
        "errors": [{
            "domain": "global",
            "reason": "invalid",
            "message": "Provided Schema does not match Table [blah]"
        }],
        "code": 400,
        "message": "Provided Schema does not match Table [blah]"
    }
}

With the bq tool I get the following error:

./bq update -t blah added_column:integer

BigQuery error in update operation: Provided Schema does not match Table [blah]

Answer 1

Try this:

bq --format=prettyjson show yourdataset.yourtable > table.json

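Edit table.json and remove everything except the contents of "fields" (that is, keep only the array [ { "name": "x" ... }, ... ]). Then add your new field to that array.

Or pipe the output through jq: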

bq --format=prettyjson show yourdataset.yourtable | jq .schema.fields > table.json

Then run:

bq update yourdataset.yourtable table.json

You can add --apilog=apilog.txt at the beginning of the command line to see exactly what is sent to and returned from the BigQuery server.
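
The manual editing step can also be scripted. A minimal sketch in Python, assuming table.json already holds only the fields array (as produced by the jq command above). The helper names add_nullable_field and update_schema_file are hypothetical and not part of any BigQuery tooling:

```python
import json


def add_nullable_field(fields, name, field_type):
    """Return a copy of a schema field list with one NULLABLE field appended.

    Hypothetical helper: `fields` is the list stored under schema.fields.
    """
    # Names are compared case-insensitively, since BigQuery does not
    # allow two columns whose names differ only by case.
    if any(f["name"].lower() == name.lower() for f in fields):
        raise ValueError(f"column {name!r} already exists")
    return fields + [{"name": name, "type": field_type, "mode": "NULLABLE"}]


def update_schema_file(path, name, field_type):
    """Rewrite the schema file at `path` with the extra field included."""
    with open(path) as fh:
        fields = json.load(fh)
    updated = add_nullable_field(fields, name, field_type)
    with open(path, "w") as fh:
        json.dump(updated, fh, indent=2)
```

After calling update_schema_file("table.json", "added_column", "INTEGER"), the bq update command is run exactly as shown above.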

Answer 2

In my case I was trying to add a REQUIRED field to a template table and ran into this error. Changing the field to NULLABLE let me update the table.
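
This matches BigQuery's rule that a column added to an existing table must be NULLABLE or REPEATED, since existing rows have no value for it. A rough pre-flight check, sketched in Python over the JSON schema format. The function name find_schema_problems is hypothetical, and the check only looks at top-level fields:

```python
def find_schema_problems(old_fields, new_fields):
    """List reasons why new_fields would be rejected as an update of old_fields.

    Hypothetical helper working on the JSON schema format (lists of dicts).
    Only top-level fields are compared; nested RECORD fields are not inspected.
    """
    problems = []
    new_by_name = {f["name"]: f for f in new_fields}
    old_names = {f["name"] for f in old_fields}

    # Every existing column must still be present in the updated schema.
    for field in old_fields:
        if field["name"] not in new_by_name:
            problems.append(f"missing existing column: {field['name']}")

    # Added columns cannot be REQUIRED; mode defaults to NULLABLE when omitted.
    for field in new_fields:
        mode = field.get("mode", "NULLABLE").upper()
        if field["name"] not in old_names and mode == "REQUIRED":
            problems.append(f"new column must not be REQUIRED: {field['name']}")
    return problems
```

Applied to the request in the question, which sends only added_column, this reports every existing column as missing, which is the likely reason for the "Provided Schema does not match Table" error.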

Here is also a more recent version of the update commands, for anyone who stumbles in from Google.

#To create table
bq mk --schema domain:string,pageType:string,source:string -t Project:Dataset.table
#Or using schema file
bq mk --schema SchemaFile.json -t Project:Dataset.table


#SchemaFile.json format
[
  {
    "mode": "REQUIRED",
    "name": "utcTime",
    "type": "TIMESTAMP"
  },
  {
    "mode": "REQUIRED",
    "name": "domain",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "testBucket",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "isMobile",
    "type": "BOOLEAN"
  },
  {
    "mode": "REQUIRED",
    "name": "Category",
    "type": "RECORD",
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "Type",
        "type": "STRING"
      },
      {
        "mode": "REQUIRED",
        "name": "Published",
        "type": "BOOLEAN"
      }
    ]
  }
]

# TO update
bq update --schema UpdatedSchema.json -t Project:Dataset.table
# Updated Schema contains old and any newly added columns

Some documentation on template tables.
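
The same full-schema rule applies to nested columns: to add a field inside a RECORD such as Category, the updated schema file repeats the whole RECORD with the new sub-field appended. A sketch in Python of building that updated schema from the existing one. add_nested_field is a hypothetical helper, and the sub-field name Author is made up for illustration:

```python
import copy


def add_nested_field(fields, record_name, new_field):
    """Return a deep copy of `fields` with new_field appended inside the named RECORD."""
    updated = copy.deepcopy(fields)
    for field in updated:
        if field["name"] == record_name:
            if field["type"].upper() not in ("RECORD", "STRUCT"):
                raise ValueError(f"{record_name} is not a RECORD column")
            field.setdefault("fields", []).append(new_field)
            return updated
    raise KeyError(f"no column named {record_name}")


# A shortened version of the schema file shown above.
schema = [
    {"mode": "REQUIRED", "name": "domain", "type": "STRING"},
    {"mode": "REQUIRED", "name": "Category", "type": "RECORD", "fields": [
        {"mode": "NULLABLE", "name": "Type", "type": "STRING"},
        {"mode": "REQUIRED", "name": "Published", "type": "BOOLEAN"},
    ]},
]

# The added sub-field is NULLABLE, like any column added to an existing table.
updated = add_nested_field(
    schema, "Category", {"mode": "NULLABLE", "name": "Author", "type": "STRING"}
)
```

The original list is left untouched, so the old and updated schemas can be compared before anything is written to UpdatedSchema.json.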

Answer 3

An example using the BigQuery Node.js API:

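const {BigQuery} = require('@google-cloud/bigquery');

const bigQuery = new BigQuery();

// New column: a REPEATED RECORD with two NULLABLE integer sub-fields.
const fieldDefinition = {
    name: 'nestedColumn',
    type: 'RECORD',
    mode: 'REPEATED',
    fields: [
        {name: 'id', type: 'INTEGER', mode: 'NULLABLE'},
        {name: 'amount', type: 'INTEGER', mode: 'NULLABLE'},
    ],
};

async function addColumn() {
    const table = bigQuery.dataset('dataset1').table('source_table_name');
    // getMetadata() resolves to an array whose first element is the table metadata.
    const [metaData] = await table.getMetadata();

    // Append the new field to the existing fields and send the full schema back.
    const fields = metaData.schema.fields;
    fields.push(fieldDefinition);

    await table.setMetadata({schema: {fields}});
}

addColumn().catch(console.error);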

Answer 4

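I was trying to add a column to an existing BigQuery table with the Python client and landed on this post several times. So I'll leave the code that solved it for me here, in case anyone runs into the same problem: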

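from google.cloud import bigquery

# update table schema
# dataset_id and table_id are assumed to be defined earlier (target dataset and table names)
bigquery_client = bigquery.Client()
# Note: Client.dataset() is deprecated in recent google-cloud-bigquery releases
dataset_ref = bigquery_client.dataset(dataset_id)
table_ref = dataset_ref.table(table_id)
table = bigquery_client.get_table(table_ref)  # API request
# Copy the existing schema and append the new column (NULLABLE by default)
new_schema = list(table.schema)
new_schema.append(bigquery.SchemaField('LOLWTFMAN','STRING'))
table.schema = new_schema
table = bigquery_client.update_table(table, ['schema'])  # API request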

Answer 5

You can also add columns to your table's schema through the GCP console, which is easier and clearer.

Answer 6

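Here is a quick snippet I wrote that dynamically adds schema columns when the incoming data (from a server, etc.) contains keys that are not in the BigQuery table's current schema: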

from google.cloud import bigquery


def verify_schema(client, table, data_dict):
    """Add a NULLABLE STRING column for every key of data_dict missing from the table."""
    schema = list(table.schema)
    existing_names = {field.name for field in schema}
    missing = [name for name in data_dict if name not in existing_names]
    if missing:
        for name in missing:
            schema.append(
                bigquery.SchemaField(name=name, field_type='STRING', mode='NULLABLE'))
        table.schema = schema
        client.update_table(table, ['schema'])  # API request
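
The snippet above types every new column as STRING. If something closer to the incoming values is wanted, one possible variation is to guess a type from each Python value. A sketch under that assumption: missing_fields and its type mapping are hypothetical choices, not part of the BigQuery client, and the result is in the JSON schema format rather than SchemaField objects:

```python
def missing_fields(existing_names, data_dict):
    """Build JSON-style field definitions for keys absent from the table."""

    def guess_type(value):
        # bool must be tested before int, because bool is a subclass of int.
        if isinstance(value, bool):
            return "BOOLEAN"
        if isinstance(value, int):
            return "INTEGER"
        if isinstance(value, float):
            return "FLOAT"
        # Anything else (including None) falls back to STRING.
        return "STRING"

    known = set(existing_names)
    return [
        {"name": key, "type": guess_type(value), "mode": "NULLABLE"}
        for key, value in data_dict.items()
        if key not in known
    ]
```

Each returned dict could then be turned into a schema field with bigquery.SchemaField.from_api_repr and appended to the schema as in verify_schema. Guessing from a single row is fragile, so falling back to STRING remains the safer default when values vary.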

6 answers

Answer 1: score 45, accepted, 2013-05-23 01:01:47
Answer 2: score 4, 2016-05-03 18:36:18
Answer 3: score 3, 2018-11-13 10:20:19
Answer 4: score 2, 2018-03-13 19:41:43
Answer 5: score 0, 2020-06-14 16:16:15
Answer 6: score 0, 2022-09-11 22:06:00
