简体   繁体   中英

How to upload an array of strings to BigQuery using python

I am trying to upload a dataframe to bigquery which contains a column that is a list of users.

A typical value in the column would be ['USER1', 'USER2', 'USER3'].

No matter how much I try though when trying to specify the BQ schema, I am never able to manage to get it to understand it is an array of strings.

Here is an example of what I tried for some reference.

bq_schema = [ 
            bigquery.SchemaField("reply_users", field_type='RECORD', fields=[bigquery.SchemaField("user", "STRING", mode="REPEATED")]),
        ]

I've tried and tried but I just cannot work out how to do this.

Any advice greatly appreciated.

With that schema, your dataframe should look like this. See code below:

from google.cloud import bigquery
import pandas as pd

client = bigquery.Client()

table_id = "project_id.sandbox.nested_from_df"
job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("reply_users", field_type='RECORD', fields=[bigquery.SchemaField("user", "STRING", mode="REPEATED")]),
    ],
)
data = {"reply_users": [{ "user": ["user1","user2"]}]}
df = pd.DataFrame(data)
print(df)

job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
job.result()

Output (dataframe):

在此处输入图像描述

Loaded data:

在此处输入图像描述

Thanks a lot.

I've had another dig through this in the morning and see where I went wrong.

My data is in the format:

data = {"reply_users": ["user1","user2"]}

Given this, the correct format for the schema is:

job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("reply_users", field_type='STR', mode='REPEATED')

My issue was that some other function had converted the list in python into a string without me realising so this was just chopping up the string letter by letter.

Appreciate the help though, finally understand how this works

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM