
Google Cloud BigQuery load_table_from_dataframe() Parquet AttributeError

I am trying to use the BigQuery package to interact with Pandas DataFrames. In my scenario, I query a base table in BigQuery, use .to_dataframe(), then pass that to load_table_from_dataframe() to load it into a new table in BigQuery.

My original problem was that str(uuid.uuid4()) (used for random IDs) was automatically being converted to bytes instead of string, so I am forcing a schema instead of letting the loader auto-detect the types.
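The idea above can be sketched with the standard library alone: generate string IDs with uuid and pin down an explicit schema (here written in BigQuery's JSON-style schema representation) so that autodetection cannot reinterpret the id column as BYTES. The column names are made up for illustration.

```python
import uuid

# Explicit schema in BigQuery's JSON schema form; forcing "STRING" here
# is what prevents the id column from being auto-detected as BYTES.
# Column names ("id", "value") are hypothetical.
SCHEMA = [
    {"name": "id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "value", "type": "FLOAT", "mode": "NULLABLE"},
]

# str(uuid.uuid4()) yields a 36-character string such as
# "550e8400-e29b-41d4-a716-446655440000", not bytes.
rows = [{"id": str(uuid.uuid4()), "value": 1.5} for _ in range(3)]
```

This only shows the shape of the data and schema; wiring it into a BigQuery load job still requires the client library.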

Now, though, I passed a job_config argument as a dict containing the schema, and I get this error:

File "/usr/local/lib/python2.7/dist-packages/google/cloud/bigquery/client.py", line 903, in load_table_from_dataframe
    job_config.source_format = job.SourceFormat.PARQUET
AttributeError: 'dict' object has no attribute 'source_format'
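The traceback can be reproduced without BigQuery at all: client.py assigns job_config.source_format, and attribute assignment on a plain dict raises exactly this AttributeError. A minimal stdlib reproduction (the schema contents here are just a stand-in):

```python
# A plain dict mimicking what was (incorrectly) passed as job_config.
job_config = {"schema": [{"name": "id", "type": "STRING"}]}

# client.py effectively does: job_config.source_format = SourceFormat.PARQUET
# Attribute assignment on a dict fails with the same message as the traceback.
try:
    job_config.source_format = "PARQUET"
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # 'dict' object has no attribute 'source_format'
```

So the error is not about Parquet support (PyArrow/FastParquet) at all; it is about the type of the job_config argument.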

I already had PyArrow installed, and also tried installing FastParquet, but it didn't help, and this didn't happen before I tried to force a schema.

Any ideas?

https://google-cloud-python.readthedocs.io/en/latest/bigquery/usage.html#using-bigquery-with-pandas

https://google-cloud-python.readthedocs.io/en/latest/_modules/google/cloud/bigquery/client.html#Client.load_table_from_dataframe

Looking into the actual package, it seems that it forces Parquet format, but as I said, I had no issue before; this only started once I tried to supply a table schema.

EDIT: This only happens when I try to write to BigQuery.

Figured it out. After weeding through Google's documentation, I realized I forgot to put:

load_config = bigquery.LoadJobConfig()
load_config.schema = SCHEMA

Oops. I never loaded the config object from the BigQuery package; I was passing a plain dict instead.



 