
BigQuery: add columns to table schema

I am trying to add a new column to an existing BigQuery table. I have tried both the bq command-line tool and the API. I get the following error when calling Tables.update().

I have also tried providing the full schema with the additional field, and that gives me the same error, shown below.

With the API, I send this request body and get the error that follows it:

{
    "schema": {
        "fields": [{
            "name": "added_column",
            "type": "integer",
            "mode": "nullable"
        }]
    }
}



{
    "error": {
        "errors": [{
            "domain": "global",
            "reason": "invalid",
            "message": "Provided Schema does not match Table [blah]"
        }],
        "code": 400,
        "message": "Provided Schema does not match Table [blah]"
    }
}

With the bq tool I get the following error:

./bq update -t blah added_column:integer

BigQuery error in update operation: Provided Schema does not match Table [blah]

Try this:

bq --format=prettyjson show yourdataset.yourtable > table.json

Edit table.json and remove everything except the contents of "fields" (e.g. keep the [ { "name": "x" ... }, ... ] ). Then add your new field to the schema.

Or pipe through jq:

bq --format=prettyjson show yourdataset.yourtable | jq .schema.fields > table.json

Then run:

bq update yourdataset.yourtable table.json

You can add --apilog=apilog.txt to the beginning of the command line, which will show exactly what is sent to and returned from the BigQuery server.
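The export-and-edit steps above can also be scripted instead of hand-editing the file or using jq. The following is a minimal sketch, assuming the output of `bq --format=prettyjson show` has been saved to a file and that its JSON contains a `schema.fields` list (as the jq filter above implies). The function name `build_schema_file` and the duplicate-name check are illustrative additions, not part of the original answer.

```python
import json


def build_schema_file(show_output_path, schema_path, new_field):
    """Write the existing fields plus new_field to schema_path as a JSON array.

    show_output_path: file holding the output of `bq --format=prettyjson show`.
    new_field: dict in schema-file format, e.g.
               {"name": "added_column", "type": "INTEGER", "mode": "NULLABLE"}.
    Returns the list of fields that was written.
    """
    with open(show_output_path) as f:
        table = json.load(f)

    # Same selection as the jq filter `.schema.fields`.
    fields = list(table["schema"]["fields"])

    # Refuse to add a column whose name is already taken
    # (compared case-insensitively).
    existing = {field["name"].lower() for field in fields}
    if new_field["name"].lower() in existing:
        raise ValueError(f"column {new_field['name']!r} already exists")

    fields.append(new_field)

    with open(schema_path, "w") as f:
        json.dump(fields, f, indent=2)
    return fields
```

Calling `build_schema_file("table.json", "schema.json", {"name": "added_column", "type": "INTEGER", "mode": "NULLABLE"})` would produce a schema.json that can be passed to `bq update yourdataset.yourtable schema.json`.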

In my case I was trying to add a REQUIRED field to a template table and was running into this error. Changing the field to NULLABLE let me update the table.
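The REQUIRED-versus-NULLABLE observation above can be checked locally before calling `bq update`. This is a rough sketch, assuming both schemas are lists of dicts in the JSON schema-file format and that a new column may be NULLABLE or REPEATED but not REQUIRED. The function name `check_additive_change` is hypothetical, and the check only looks at top-level column names and modes; it does not compare types or nested fields.

```python
def check_additive_change(old_fields, new_fields):
    """Return a list of problems that make the proposed schema non-additive.

    An empty list means every old column is still present and no newly
    added column is REQUIRED.
    """
    problems = []
    new_names = {field["name"] for field in new_fields}
    old_names = {field["name"] for field in old_fields}

    # Every existing column must still be part of the proposed schema.
    for old in old_fields:
        if old["name"] not in new_names:
            problems.append(f"existing column {old['name']!r} is missing")

    # Newly added columns must not be REQUIRED; an omitted mode is
    # treated as NULLABLE.
    for new in new_fields:
        if new["name"] in old_names:
            continue
        mode = new.get("mode", "NULLABLE").upper()
        if mode == "REQUIRED":
            problems.append(f"new column {new['name']!r} must not be REQUIRED")

    return problems
```

An empty result does not guarantee that the server accepts the update; it only rules out the two mistakes discussed in this thread (sending just the new field, and adding a REQUIRED field).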

Also, here is a more recent version of the update commands, for anybody arriving here from Google.

#To create table
bq mk --schema domain:string,pageType:string,source:string -t Project:Dataset.table
#Or using schema file
bq mk --schema SchemaFile.json -t Project:Dataset.table


# SchemaFile.json format
[
  {
    "mode": "REQUIRED",
    "name": "utcTime",
    "type": "TIMESTAMP"
  },
  {
    "mode": "REQUIRED",
    "name": "domain",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "testBucket",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "isMobile",
    "type": "BOOLEAN"
  },
  {
    "mode": "REQUIRED",
    "name": "Category",
    "type": "RECORD",
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "Type",
        "type": "STRING"
      },
      {
        "mode": "REQUIRED",
        "name": "Published",
        "type": "BOOLEAN"
      }
    ]
  }
]

# To update
bq update --schema UpdatedSchema.json -t Project:Dataset.table
# UpdatedSchema.json contains the old columns plus any newly added ones

Some docs for template tables

Example using the BigQuery Node.js API:

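const {BigQuery} = require('@google-cloud/bigquery');

const bigQuery = new BigQuery();

// New column: a repeated record with two nullable integer sub-fields.
const fieldDefinition = {
    name: 'nestedColumn',
    type: 'RECORD',
    mode: 'REPEATED',
    fields: [
        {name: 'id', type: 'INTEGER', mode: 'NULLABLE'},
        {name: 'amount', type: 'INTEGER', mode: 'NULLABLE'},
    ],
};

async function addColumn() {
    const table = bigQuery.dataset('dataset1').table('source_table_name');
    // getMetadata() resolves to an array; the first element is the table metadata.
    const [metaData] = await table.getMetadata();

    // Append the new field to the existing fields and send the full schema back.
    const fields = metaData.schema.fields;
    fields.push(fieldDefinition);

    await table.setMetadata({schema: {fields}});
}

addColumn().catch(console.error);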

I was stuck trying to add columns to an existing table in BigQuery using the Python client and found this post several times. I'll leave the piece of code that solved it for me here, in case someone is having the same problem:

from google.cloud import bigquery

# update table schema (dataset_id and table_id are assumed to be defined already)
bigquery_client = bigquery.Client()
dataset_ref = bigquery_client.dataset(dataset_id)
table_ref = dataset_ref.table(table_id)
table = bigquery_client.get_table(table_ref)  # API request
new_schema = list(table.schema)  # copy the existing schema
new_schema.append(bigquery.SchemaField('LOLWTFMAN', 'STRING'))  # mode defaults to NULLABLE
table.schema = new_schema
table = bigquery_client.update_table(table, ['schema'])  # API request

You can also add columns to your table's schema through the GCP console, which is easier and clearer:

[Image: adding schema to your table]

Here's a quick snippet I wrote that will dynamically add schema columns if the data coming in (from a server, etc.) doesn't match what currently exists in a BigQuery table:

from google.cloud import bigquery


def verify_schema(client, table, data_dict):
    """Add a NULLABLE STRING column for every key of data_dict missing from the table."""
    schema = list(table.schema)
    existing_schema_names = {field.name for field in schema}
    new_fields = [
        bigquery.SchemaField(name=key, field_type='STRING', mode='NULLABLE')
        for key in data_dict
        if key not in existing_schema_names
    ]
    # Only call the API when there is something to add.
    if new_fields:
        table.schema = schema + new_fields
        client.update_table(table, ['schema'])
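The snippet above adds every unknown key as a STRING column. If typed columns are preferred, the type could be guessed from the incoming values first. The following is only a sketch, assuming a flat dict with scalar values. The helper names `infer_field` and `missing_fields` and the Python-to-BigQuery type mapping are illustrative choices, not something from the original answer, and anything that is not a bool, int or float (including None) falls back to STRING.

```python
def infer_field(name, value):
    """Guess a schema-file entry for one key/value pair of incoming data."""
    # bool has to be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        field_type = "BOOLEAN"
    elif isinstance(value, int):
        field_type = "INTEGER"
    elif isinstance(value, float):
        field_type = "FLOAT"
    else:
        field_type = "STRING"
    # New columns are always NULLABLE so they can be added to an existing table.
    return {"name": name, "type": field_type, "mode": "NULLABLE"}


def missing_fields(existing_names, data_dict):
    """Return schema entries for the keys of data_dict not in existing_names."""
    existing = set(existing_names)
    return [
        infer_field(key, value)
        for key, value in data_dict.items()
        if key not in existing
    ]
```

Each returned dict carries the same three pieces of information that `verify_schema` passes to `bigquery.SchemaField` (`name`, `field_type`, `mode`), so the two can be combined if the guessed types turn out to be reliable for the data at hand.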


