Download BigQuery Table as JSON

I would like to download an existing BigQuery table as JSON so that I can manipulate one of its columns, which contains a long string.

The BigQuery table was ingested from a Datastore backup file that was exported from App Engine to GCS. I used BigQuery to read the Datastore backup file from GCS and created a table from it, which resulted in a repeated string column holding one very long string.

I couldn't parse the long string, so I need to download the table as JSON and re-upload it to BigQuery as a new table. I would appreciate advice on this approach.

There are 3 ways to export your data:

  1. Single URI (1 file, limit 1GB, most probably you are using this)

['gs://my-bucket/file-name.json']

Creates:

gs://my-bucket/file-name.json

  2. Single wildcard URI (multiple files are created, each up to 1GB)

['gs://my-bucket/file-name-*.json']

Creates:

gs://my-bucket/file-name-000000000000.json
gs://my-bucket/file-name-000000000001.json
gs://my-bucket/file-name-000000000002.json
...

  3. Multiple wildcard URIs (meant for parallel processing, for example with Hadoop)

gs://my-bucket/file-name-{worker number}-*.json

Creates:

This example assumes that BigQuery creates 80 sharded files in each partition.

gs://my-bucket/file-name-1-000000000000.json
gs://my-bucket/file-name-1-000000000001.json
...
gs://my-bucket/file-name-1-000000000080.json
gs://my-bucket/file-name-2-000000000000.json
gs://my-bucket/file-name-2-000000000001.json
...
gs://my-bucket/file-name-2-000000000080.json
gs://my-bucket/file-name-3-000000000000.json
gs://my-bucket/file-name-3-000000000001.json
...
gs://my-bucket/file-name-3-000000000080.json
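
As a rough illustration of the wildcard option, here is a minimal Python sketch. It assumes the google-cloud-bigquery client library is installed and credentials are configured. The function names expected_shard_names and export_table_as_json are made up for this example, and the table ID you pass in is your own. BigQuery decides how many shards it writes, so the first helper only reproduces the naming pattern shown above.

```python
def expected_shard_names(uri_pattern, shard_count):
    """Expand a single-wildcard URI into shard names like those listed above."""
    if uri_pattern.count("*") != 1:
        raise ValueError("expected exactly one '*' in the URI pattern")
    # Shards are numbered from 0 with a 12-digit zero-padded suffix.
    return [uri_pattern.replace("*", f"{i:012d}") for i in range(shard_count)]


def export_table_as_json(table_id, destination_uri):
    """Run an extract job that writes newline-delimited JSON to GCS."""
    # Imported here so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.ExtractJobConfig()
    job_config.destination_format = (
        bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
    )
    extract_job = client.extract_table(
        table_id, destination_uri, job_config=job_config
    )
    extract_job.result()  # block until the export finishes
```

Calling export_table_as_json with a URI such as 'gs://my-bucket/file-name-*.json' should let BigQuery split a table larger than 1GB across several files.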

Read more at:

https://cloud.google.com/bigquery/exporting-data-from-bigquery
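
Once the files are exported, the "manipulate one column and load it back" step from the question could look roughly like the sketch below. It is only a sketch: the question does not say how the long string is structured, so split_on_commas is a purely hypothetical parser, and rewrite_column and the column name used in the usage note are likewise invented for illustration.

```python
import json


def rewrite_column(lines, column, transform):
    """Yield JSON lines with `transform` applied to one column of each row."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        row = json.loads(line)
        if column in row:
            row[column] = transform(row[column])
        yield json.dumps(row)


def split_on_commas(value):
    # Hypothetical parser: assumes the long string is comma-separated.
    return [part.strip() for part in value.split(",")]
```

For example, rewrite_column(open("file-name.json"), "tags", split_on_commas) would produce rows whose "tags" value is a list of strings. The rewritten lines are still newline-delimited JSON, so they can be loaded into a new table with a load job that uses the NEWLINE_DELIMITED_JSON source format.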
