Arrays not supported in BigQuery Python API
The documentation for the Python BigQuery API indicates that arrays are supported; however, when loading a pandas DataFrame into BigQuery, a pyarrow struct error is raised.
The only workaround appears to be to drop the array columns and use pandas `json_normalize` to build a separate table.
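A minimal sketch of that workaround, using `pandas.json_normalize` (the column and table contents here are illustrative, not from the original question):

```python
import pandas as pd

# Illustrative data: each row carries an array-of-structs column ("items"),
# the kind of column load_table_from_dataframe cannot serialize to parquet.
df = pd.DataFrame({
    "order_id": [1, 2],
    "customer": ["alice", "bob"],
    "items": [
        [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
        [{"sku": "C3", "qty": 5}],
    ],
})

# Main table: drop the unsupported array column before loading.
main_df = df.drop(columns=["items"])

# Separate table: flatten the arrays, keeping order_id as a join key.
items_df = pd.json_normalize(
    df.to_dict(orient="records"),
    record_path="items",
    meta=["order_id"],
)

print(main_df)
print(items_df)
```

Each flattened DataFrame can then be loaded with `load_table_from_dataframe` as usual, since neither contains array columns.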
'''
from google.cloud import bigquery

project = 'lake'
client = bigquery.Client(credentials=credentials, project=project)

dataset_ref = client.dataset('XXX')
table_ref = dataset_ref.table('RAW_XXX')

job_config = bigquery.LoadJobConfig()
job_config.autodetect = True
job_config.write_disposition = 'WRITE_TRUNCATE'

# Raises NotImplementedError when the DataFrame contains array columns
client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()
'''
This is the error received: NotImplementedError: struct
This is currently not supported due to how parquet serialization works.
A feature request to support uploading pandas DataFrames containing arrays was filed on the client library's GitHub:
https://github.com/googleapis/google-cloud-python/issues/8544
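Until that feature lands, one interim approach (an assumption on my part, not from the original answer) is to bypass parquet entirely: convert the DataFrame to plain JSON rows and load those with the client's `load_table_from_json` method, which serializes as newline-delimited JSON and so preserves REPEATED fields. The local conversion step can be sketched as:

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2],
    "tags": [["a", "b"], ["c"]],  # array column that breaks parquet serialization
})

# Plain dict records keep the arrays intact; these can then be passed to
# client.load_table_from_json(json_rows, table_ref, job_config=job_config)
# in place of load_table_from_dataframe (the call itself is not shown here,
# since it needs live credentials).
json_rows = df.to_dict(orient="records")

print(json_rows)
```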