
Performance difference between methods of loading JSON data into BigQuery

What is the performance difference between two methods of loading JSON data into BigQuery: load_table_from_file(io.StringIO(json_data)) vs create_rows_json?

The first one loads the file as a whole and the second one streams the data. Does that mean the first method will complete faster but is all-or-nothing, while the second is slower but more granular? Any other concerns? Thanks!

The two methods serve different purposes, and each has its own limits.

  1. Loading from a file is great if your data is already in files. A single file can be up to 5 TB in size. Load jobs are free, and you can query the data immediately after the job completes.

  2. The streaming insert is great if your data arrives as events that you can stream to BigQuery. While a single streaming request is limited to 10 MB, streaming can be heavily parallelized, up to 1 million rows per second, which is a big scale. Streaming rows into BigQuery has its own cost. You can query the data immediately after streaming, but for copy and export jobs the rows can take up to 90 minutes to become available.
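A minimal sketch of the first approach (the load job). One detail worth knowing: when loading JSON from a file-like object, BigQuery expects newline-delimited JSON (one object per line), not a JSON array, so the payload usually needs to be serialized accordingly. The table name, schema autodetection, and the commented-out client calls below are illustrative assumptions, not from the original post:

```python
import io
import json

def to_ndjson(rows):
    """Serialize a list of dicts to newline-delimited JSON,
    the format BigQuery load jobs expect for JSON source data."""
    return "\n".join(json.dumps(row) for row in rows)

rows = [{"name": "alice", "age": 30}, {"name": "bob", "age": 25}]
payload = io.StringIO(to_ndjson(rows))

# Hypothetical load job (requires google-cloud-bigquery and credentials):
# from google.cloud import bigquery
# client = bigquery.Client()
# job_config = bigquery.LoadJobConfig(
#     source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
#     autodetect=True,  # assumption: let BigQuery infer the schema
# )
# job = client.load_table_from_file(
#     payload, "my_dataset.my_table", job_config=job_config
# )
# job.result()  # blocks until the load job completes; data is then queryable
```

Because the load is a single job, it either succeeds as a whole or fails as a whole, which matches the "all-or-nothing" intuition in the question.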
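For the second approach, the 10 MB per-request limit from point 2 means a large batch of events has to be split across multiple streaming requests. Below is a sketch of such a batcher; the size accounting is approximate (it ignores per-request envelope overhead), and `create_rows_json` was renamed `insert_rows_json` in later versions of the Python client, which is what the commented-out call assumes:

```python
import json

MAX_REQUEST_BYTES = 10 * 1024 * 1024  # 10 MB streaming-request limit

def chunk_rows(rows, max_bytes=MAX_REQUEST_BYTES):
    """Split rows into batches whose serialized size stays under
    the per-request streaming limit."""
    batch, size = [], 0
    for row in rows:
        row_size = len(json.dumps(row).encode("utf-8"))
        if batch and size + row_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_size
    if batch:
        yield batch

# Hypothetical streaming insert (requires google-cloud-bigquery):
# from google.cloud import bigquery
# client = bigquery.Client()
# for batch in chunk_rows(events):
#     errors = client.insert_rows_json("my_dataset.my_table", batch)
#     if errors:
#         raise RuntimeError(errors)  # per-row insert errors come back here
```

Each request succeeds or fails independently, which is what makes streaming more granular than a load job: a failed batch can be retried without touching the rows that already landed.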

