
Fastest way to copy over a postgres database to BigQuery

Problem statement: I need to copy a 100 GB database from a GCP Postgres DB into GCP BigQuery located in the same region/availability zone. Ideally, I'd like to be able to make a full copy every 15-20 minutes.

Attempt at the solution :

  1. Using Airbyte --> takes several hours, which is way too slow.
  2. Using pg_dump --> also didn't quite make the mark.

Question: What else can I try? What other tools are available that would enable me to do this? I think the issue here is the network transfer speed... I seem to be getting a transfer speed of about 2 Mb/s, which is super slow.

Any suggestions on how I can make this fast in GCP?

If you use Cloud SQL for Postgres, the easiest way is to use federated queries. You can create a connection in BigQuery to Cloud SQL and query Cloud SQL directly from BigQuery.
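For example, querying Cloud SQL directly from BigQuery with EXTERNAL_QUERY looks like this (a minimal sketch: the connection ID my-project.us.my_pg_connection and the table public.customers are placeholders for your own setup):

    -- Run the inner query on Cloud SQL Postgres and return the rows to BigQuery
    SELECT *
    FROM EXTERNAL_QUERY(
      'my-project.us.my_pg_connection',
      'SELECT id, name, updated_at FROM public.customers;'
    );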

Then perform an INSERT INTO ... SELECT, and your data is copied.
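A minimal sketch of that copy, reusing the placeholder connection, dataset, and column names from above:

    -- Append rows fetched from Cloud SQL Postgres into a BigQuery table
    INSERT INTO mydataset.customers (id, name, updated_at)
    SELECT id, name, updated_at
    FROM EXTERNAL_QUERY(
      'my-project.us.my_pg_connection',
      'SELECT id, name, updated_at FROM public.customers;'
    );

Since you want a full copy rather than an append, CREATE OR REPLACE TABLE mydataset.customers AS SELECT ... works the same way and rebuilds the table on each run. Either statement can be run as a BigQuery scheduled query; the minimum scheduling interval is 15 minutes, which matches your target cadence.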

Note: you must have a public IP on your Cloud SQL instance (you can leave the authorized networks list empty, so your database is still locked down and protected from the open internet).
