
How to get column name from BigQuery API?

I can get column values using the following code:

import os
from google.cloud.bigquery import Client

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'C:\Users\xxx\Desktop\key.json'
bq_client = Client()
query = "SELECT msts, coreuserid, spend_usd FROM `project.f_purchase` where dt = '2019-04-02' limit 5"
query_job = bq_client.query(query)
results = query_job.result()

for row in results:
    print("{}, {}, {}".format(row.msts, row.coreuserid, row.spend_usd))

But as shown in the last line, this requires the column names to be written out directly. Now I have multiple queries, and I want to run them in a loop and display the results. Is there a way to do something like .format(row.column1, row.column2...)? In addition, the number of result columns differs between the queries.

Any help is appreciated.

Per the BigQuery Python client documentation, you can loop over the row object as follows without specifying the exact column name:

for row in query_job:  # API request - fetches results
    # Row values can be accessed by field name or index
    assert row[0] == row.name == row["name"]
    print(row)

In addition, you can always use the SchemaField values, as described in this answer:

result = ["{0} {1}".format(schema.name, schema.field_type) for schema in table.schema]
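To see what that comprehension produces without a live table, here is a minimal sketch using a namedtuple as a hypothetical stand-in for SchemaField (the real objects, which expose .name and .field_type, come from table.schema or query results):

```python
from collections import namedtuple

# Stand-in for google.cloud.bigquery.SchemaField; real objects come from table.schema
SchemaField = namedtuple("SchemaField", ["name", "field_type"])

schema = [SchemaField("state", "STRING"), SchemaField("gender", "STRING")]

# Pair each column name with its type, without hard-coding either
result = ["{0} {1}".format(field.name, field.field_type) for field in schema]
print(result)  # ['state STRING', 'gender STRING']
```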

Here is an example, using a BigQuery public dataset, of how to access fields without hard-coding the field names:

from google.cloud import bigquery

client = bigquery.Client()

query = (
    "SELECT state, max(gender) AS gender "
    "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
    "GROUP BY state "
    "LIMIT 10"
)
query_job = client.query(
    query,
    # Location must match that of the dataset(s) referenced in the query.
    location="US",
)  # API request - starts the query

results = query_job.result()  # API request - waits for the query to finish
fields = results.schema  # list of SchemaField objects, in column order

for row in results:
    # Row values can be accessed by index; column names come from the schema
    print("{} AS {}, {} AS {}".format(row[0], fields[0].name, row[1], fields[1].name))

Which produces the following output:

superQuery:bin tamirklein$ python test.py
AK AS state, M AS gender
AL AS state, M AS gender
AR AS state, M AS gender
AZ AS state, M AS gender
CA AS state, M AS gender
CO AS state, M AS gender
CT AS state, M AS gender
DC AS state, M AS gender
DE AS state, M AS gender
FL AS state, M AS gender

You can also cast each row in your for loop to a dict (via dict(row)). The keys are then the column names, so you can do whatever you can do with a dictionary: iterate over the keys (column names), the values (column values), or both together, without needing to know the column names up-front.
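For illustration, a small sketch of that dict-based approach; the formatting helper works on any mapping, so a plain dict stands in here for a converted BigQuery Row:

```python
def describe_row(row_dict):
    # Works for any number of columns: keys are column names, values are cell values
    return ", ".join("{} AS {}".format(value, name) for name, value in row_dict.items())

# With BigQuery this would be called as describe_row(dict(row)) inside the result loop;
# a plain dict stands in for dict(row) here:
print(describe_row({"state": "AK", "gender": "M"}))  # AK AS state, M AS gender
```

Because the helper just iterates the mapping, it handles queries with different numbers of result columns without any changes.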
