
Arrays not supported in BigQuery Python API

The documentation for the BigQuery Python API indicates that arrays are supported. However, when loading a pandas DataFrame into BigQuery, a pyarrow struct error is raised.

The only way around it seems to be to drop the affected columns and then use JSON normalize to put them into a separate table.
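The workaround described above could look roughly like the following sketch. The sample frame and the column names `order_id` and `items` are hypothetical stand-ins for the real `appended_data`; only `DataFrame.drop` and `pandas.json_normalize` are relied on.

```python
import pandas as pd

# Hypothetical sample standing in for `appended_data`;
# `items` holds an array of structs, which is what triggers the error.
appended_data = pd.DataFrame({
    "order_id": [1, 2],
    "customer": ["a", "b"],
    "items": [
        [{"sku": "x1", "qty": 2}, {"sku": "x2", "qty": 1}],
        [{"sku": "y1", "qty": 5}],
    ],
})

nested_col = "items"   # assumed name of the nested column
key_col = "order_id"   # assumed key used to join the two tables later

# Parent table: the original data with the nested column dropped.
flat_parent = appended_data.drop(columns=[nested_col])

# Child table: one row per array element, carrying the parent key along.
child = pd.json_normalize(
    appended_data.to_dict(orient="records"),
    record_path=nested_col,
    meta=[key_col],
)

print(flat_parent)
print(child)
```

Both `flat_parent` and `child` contain only scalar columns, so each could then be loaded into its own table and joined on the key column in BigQuery.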

```python
from google.cloud import bigquery

project = 'lake'
# `credentials` is assumed to have been created earlier.
client = bigquery.Client(credentials=credentials, project=project)
dataset_ref = client.dataset('XXX')
table_ref = dataset_ref.table('RAW_XXX')

job_config = bigquery.LoadJobConfig()
job_config.autodetect = True
job_config.write_disposition = 'WRITE_TRUNCATE'

# `appended_data` is the pandas DataFrame that contains the array columns.
client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()
```

This is the error received: NotImplementedError: struct

This is currently not supported because of how Parquet serialization works.

A feature request to upload pandas DataFrames containing arrays was created on the client library's GitHub:

https://github.com/googleapis/google-cloud-python/issues/8544
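Since the limitation comes from the Parquet step, one possible alternative (not part of the original answer, and dependent on the installed client library version) is to send the rows as JSON instead. The sketch below assumes `Client.load_table_from_json` is available; the helper names `dataframe_to_json_rows` and `load_via_json` and the sample frame are hypothetical.

```python
import json

import pandas as pd


def dataframe_to_json_rows(df):
    """Convert a DataFrame into a list of plain dicts, keeping nested lists."""
    # Round-tripping through to_json turns NumPy scalars and timestamps
    # into JSON-safe Python values.
    return json.loads(df.to_json(orient="records", date_format="iso"))


def load_via_json(client, df, table_ref):
    """Load the rows as JSON rather than Parquet (sketch, not run here)."""
    # Imported lazily so the conversion helper works without the client library.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig()
    job_config.autodetect = True
    job_config.write_disposition = 'WRITE_TRUNCATE'
    job = client.load_table_from_json(
        dataframe_to_json_rows(df), table_ref, job_config=job_config
    )
    return job.result()


# Hypothetical sample with an array column.
sample = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})
rows = dataframe_to_json_rows(sample)
print(rows)
```

With the `client` and `table_ref` from the question, the call would be `load_via_json(client, appended_data, table_ref)`. Schema autodetection on JSON may infer different types than a DataFrame load would, so an explicit schema may be the safer choice.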
