
How to export a huge table from BigQuery into a Google cloud bucket as one file

I am trying to export a huge table (2,000,000,000 rows, roughly 600GB in size) from BigQuery into a Google Cloud Storage bucket as a single file. All tools suggested in Google's documentation are limited in export size and will create multiple files. Is there a pythonic way to do it without needing to hold the entire table in memory?

While there may be other ways to script this, the recommended solution is to merge the exported files using the Google Cloud Storage compose operation.

What you have to do is:

  • Export the table in CSV format.
  • This produces many files, since BigQuery shards large exports.
  • Run the compose action in batches of up to 32 files (the per-request limit), repeating until only one object remains: the merged big file.

All of this can be combined in a Cloud Workflow; there is a tutorial here.
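
If you prefer to drive it from Python directly (as the question asks), the same export-then-compose approach can be scripted with the google-cloud-bigquery and google-cloud-storage client libraries. The snippet below is a minimal sketch of that idea, not the tutorial's exact solution; the project, bucket, prefix, and table names are placeholders to replace with your own.

from google.cloud import bigquery, storage

PROJECT = "my-project"                      # placeholder project id
BUCKET = "my-bucket"                        # placeholder bucket name
PREFIX = "export/part"                      # prefix for the sharded CSV files
TABLE = "my-project.my_dataset.my_table"    # placeholder fully-qualified table

# 1. Export the table to many CSV shards; the wildcard URI is required for large tables.
#    print_header=False avoids header rows reappearing in the middle of the merged file.
bq = bigquery.Client(project=PROJECT)
extract_job = bq.extract_table(
    TABLE,
    f"gs://{BUCKET}/{PREFIX}-*.csv",
    job_config=bigquery.ExtractJobConfig(destination_format="CSV", print_header=False),
)
extract_job.result()  # wait for the export to finish

# 2. Compose the shards server-side, 32 at a time (the per-request compose limit),
#    repeating rounds until only one object remains. Nothing is downloaded locally.
gcs = storage.Client(project=PROJECT)
bucket = gcs.bucket(BUCKET)
shards = sorted(bucket.list_blobs(prefix=f"{PREFIX}-"), key=lambda b: b.name)

round_no = 0
while len(shards) > 1:
    merged = []
    for i in range(0, len(shards), 32):
        target = bucket.blob(f"{PREFIX}-merge-{round_no}-{i // 32}.csv")
        target.compose(shards[i:i + 32])
        merged.append(target)
    shards = merged
    round_no += 1

# 3. Rename the single remaining object to the final file name.
bucket.rename_blob(shards[0], "export/final.csv")

Intermediate merge objects from earlier rounds are left in place in this sketch; in practice you would delete them once the final compose succeeds.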
