
Exporting data from BigQuery to CSV

I have a record set of <100k rows. BigQuery won't let me download it to my computer in CSV format, claiming it is too big and that I have to store it first - is there any workaround? I want to load my output into R, and the easiest way is via CSV.

You need to use Google Cloud Storage for your export job. Exporting data from BigQuery is explained here; also check the variants for the different path syntaxes.
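As a rough sketch with the Python client (the bucket, dataset, and table names below are placeholders, not from the original post):

from google.cloud import bigquery

client = bigquery.Client(project="<some-project>")

# Fully qualified table to export (placeholder names)
table_ref = "<some-project>.<my_dataset>.<my_table>"

# Wildcard URI so BigQuery can split a large table into multiple files
destination_uri = "gs://<my-bucket>/export/my_table-*.csv"

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    location="US",  # must match the dataset location
)
extract_job.result()  # wait for the export to finish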

Then you can download the files from GCS to your local storage. There is no way to download large data directly from BigQuery to your local computer; you have to go through GCS.
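The download step can also be scripted, for example with the google-cloud-storage client (bucket name and prefix are again placeholders, and it assumes you have run pip install google-cloud-storage):

from google.cloud import storage

storage_client = storage.Client(project="<some-project>")
bucket = storage_client.bucket("<my-bucket>")

# Download every exported shard under the prefix to the current directory
for blob in bucket.list_blobs(prefix="export/"):
    blob.download_to_filename(blob.name.split("/")[-1])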

I'm using the following Python script for this task; it can handle large datasets without loading them into memory.

Make sure to install the dependencies and change the variables:

pip install google-cloud-bigquery

Change the variables to fit your needs: the query, project, output file, and file encoding (if required).

from google.cloud import bigquery
import csv

# Output file
output_file = "output.csv"

# GCP project
project = "<some-project>"

# File encoding - utf-8-sig writes a BOM so Excel recognizes the file as UTF-8
file_encoding = "utf-8-sig"

# The query to execute
query = """
SELECT * FROM `<some-project>.<my_dataset>.<my_table>`
"""

client = bigquery.Client(project=project)
query_job = client.query(query)  # start the query job
result = query_job.result()      # wait for completion; returns a row iterator
schema = result.schema           # column metadata for the header row


with open(output_file, "w", encoding=file_encoding, newline="") as f:
    writer = csv.writer(f)
    # Write headers
    header = [field.name for field in schema]
    writer.writerow(header)
    # Stream rows to the file without holding the full result set in memory
    for row in result:
        writer.writerow(row)
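If the result set comfortably fits in memory (which <100k rows usually does), an alternative sketch is to pull it straight into a pandas DataFrame and write the CSV from there; this assumes pandas (and, on newer client versions, db-dtypes) is installed:

# Alternative for small result sets - loads everything into memory
df = client.query(query).result().to_dataframe()
df.to_csv(output_file, index=False, encoding=file_encoding)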
