BigQuery converts my string field into an integer while loading a JSON file with Python
{"number":"1234123"}
I am loading this data into my BigQuery table using bigquery.LoadJobConfig in Python. The number column in my BigQuery table has type STRING, but when I run the load operation, the column's data type is converted to INTEGER. How can I solve this? The file type I loaded: JSON.
job_config = bigquery.LoadJobConfig(
    create_disposition=bigquery.CreateDisposition.CREATE_IF_NEEDED,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,
)
Additionally: when I set autodetect to False, I get this error: "Error while reading data, error message: JSON table encountered too many errors".
I recommend passing an explicit BigQuery schema instead of using autodetect=True to prevent this situation. For example:
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to create.
table_id = "your-project.your_dataset.your_table_name"

job_config = bigquery.LoadJobConfig(
    # Declare the column type explicitly so it stays STRING.
    schema=[
        bigquery.SchemaField("number", "STRING"),
    ],
    create_disposition=bigquery.CreateDisposition.CREATE_IF_NEEDED,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=False,
)

# TODO(developer): Set uri to your own newline-delimited JSON file.
# This is a placeholder path; the file's fields must match the schema above.
uri = "gs://your-bucket/your-file.json"

load_job = client.load_table_from_uri(
    uri,
    table_id,
    location="US",  # Must match the destination dataset location.
    job_config=job_config,
)  # Make an API request.

load_job.result()  # Waits for the job to complete.

destination_table = client.get_table(table_id)
print("Loaded {} rows.".format(destination_table.num_rows))
In this example I set the schema of the BigQuery table explicitly and set autodetect to False. If you set autodetect to True, BigQuery infers the field types from the data, so you have no control over them.
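To illustrate why a quoted value such as "1234123" can end up as INTEGER, here is a toy sketch of value-based type inference. It is not BigQuery's actual algorithm; `guess_type` is a hypothetical helper that only imitates the behaviour described in the question, using the standard library.

```python
import json


def guess_type(value):
    """Toy guess of a column type from one JSON value (not BigQuery's real logic)."""
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    if isinstance(value, str):
        # Assumption: a quoted value that parses as an integer is treated
        # as INTEGER, which matches what the question observes.
        try:
            int(value)
            return "INTEGER"
        except ValueError:
            return "STRING"
    return "STRING"


row = json.loads('{"number":"1234123"}')
inferred = {name: guess_type(value) for name, value in row.items()}
print(inferred)
```

An inference step like this only sees the values, not your intent, which is why declaring the type in a schema is the more predictable option.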
You can check the documentation for more information.
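If a load with an explicit schema still fails with "JSON table encountered too many errors", it may help to check the file locally first. The sketch below is a rough pre-flight check using only the standard library. The helper name `find_load_problems` and the sample rows are hypothetical, and it covers only three likely causes: lines that are not valid JSON, rows that are not JSON objects, and fields missing from the schema (which BigQuery rejects unless ignore_unknown_values is enabled on the job config).

```python
import io
import json


def find_load_problems(lines, schema_fields):
    """Return (line_number, field, problem) tuples for rows likely to fail a load."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            problems.append((line_number, None, "invalid JSON"))
            continue
        if not isinstance(row, dict):
            problems.append((line_number, None, "row is not a JSON object"))
            continue
        for field in row:
            if field not in schema_fields:
                problems.append((line_number, field, "field not in schema"))
    return problems


# Hypothetical sample data standing in for a newline-delimited JSON file.
sample = io.StringIO(
    '{"number":"1234123"}\n'
    '{"number":"42","extra":1}\n'
    'not json\n'
    '[1, 2]\n'
)
problems = find_load_problems(sample, {"number"})
for problem in problems:
    print(problem)
```

For a real file, pass an open file object instead of the `io.StringIO` sample, for example `open("data.json", encoding="utf-8")`.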