Select columns based on values of another column if present in a list

I'm trying to select columns A and B from my BigQuery table using pandas.read_gbq, keeping only the rows where the value of column C is present in a list. However, when I use format to insert the list into my query string, the list's contents are surrounded by square brackets [], which breaks the query.

I used replace on the query string to manually remove the square brackets.

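import pandas

values_in_list = ['a', 'b', 'c']
# format() inserts str(values_in_list), i.e. "['a', 'b', 'c']",
# so the square brackets are stripped from the finished query string.
query = """
SELECT
  column_A,
  column_B

FROM
  my_table

WHERE
  column_C IN ({})
""".format(values_in_list).replace('[', '').replace(']', '')
query_df = pandas.read_gbq(query, project_id='some-project', dialect='standard')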

This gets the job done, but is there a more elegant solution than brute-forcing it?
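
As an aside, the brackets appear because str.format calls str() on the list, which yields its Python repr. The sketch below reproduces that and shows a join-based variant of the same string-building approach. sql_string_list is a hypothetical helper, not part of pandas or BigQuery, and its escaping assumes BigQuery standard SQL string-literal rules. Hand-built SQL like this is still open to injection if the values come from untrusted input.

```python
values_in_list = ['a', 'b', 'c']

# str.format() calls str() on the list, which is where the brackets come from.
formatted = "IN ({})".format(values_in_list)


def sql_string_list(values):
    """Render strings as comma-separated, single-quoted SQL literals.

    Hypothetical helper: escapes backslashes and single quotes with a
    backslash, assuming BigQuery standard SQL string-literal rules.
    """
    escaped = (v.replace("\\", "\\\\").replace("'", "\\'") for v in values)
    return ", ".join("'{}'".format(v) for v in escaped)


# Same IN clause, built without stripping brackets afterwards.
in_clause = "IN ({})".format(sql_string_list(values_in_list))
```

Unlike the blanket replace calls, this leaves any square brackets elsewhere in the query or inside the values themselves untouched.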

I'm not sure whether pandas.read_gbq accepts an ArrayQueryParameter through its configuration keyword argument. Here's my workaround, which uses the google-cloud-bigquery client directly:

from google.cloud import bigquery
client = bigquery.Client()

values_in_list = ['a', 'b', 'c']
query = """
SELECT
  column_A,
  column_B

FROM
  my_table

WHERE
 column_C IN UNNEST(@col_c_vals)
"""

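# Bind the Python list to @col_c_vals as an ARRAY<STRING> query parameter.
query_params = [bigquery.ArrayQueryParameter('col_c_vals', 'STRING', values_in_list)]
job_config = bigquery.QueryJobConfig(query_parameters=query_params)
# Run the query and load the result into a pandas DataFrame.
query_df = client.query(query, job_config=job_config).to_dataframe()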
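
If staying with pandas.read_gbq is preferred, one possibility is to pass the same array parameter through its configuration argument as a REST-style job configuration dictionary. The sketch below only builds that dictionary. array_param_configuration is a hypothetical helper, and whether read_gbq forwards queryParameters depends on the installed pandas-gbq version, so treat the commented usage as an assumption rather than confirmed behaviour.

```python
def array_param_configuration(name, element_type, values):
    """Build a REST-style query configuration with one named array parameter.

    Hypothetical helper; the layout follows the BigQuery REST representation
    of a named ARRAY query parameter.
    """
    return {
        "query": {
            "parameterMode": "NAMED",
            "queryParameters": [
                {
                    "name": name,
                    "parameterType": {
                        "type": "ARRAY",
                        "arrayType": {"type": element_type},
                    },
                    "parameterValue": {
                        "arrayValues": [{"value": v} for v in values],
                    },
                }
            ],
        }
    }


configuration = array_param_configuration("col_c_vals", "STRING", ["a", "b", "c"])

# Assumed usage (not run here; needs pandas-gbq and valid credentials):
# query_df = pandas.read_gbq(query, project_id='some-project',
#                            dialect='standard', configuration=configuration)
```

The query text stays the same as in the workaround above, with column_C IN UNNEST(@col_c_vals) in the WHERE clause.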
