
Can a private Cloud Data Fusion instance connect to the internet?

Our application consists of a Spring Boot app server deployed through "Cloud Run" and a "Cloud SQL Postgres" database.

The database is private and connected to a private VPC.
The app server can connect to the database through a gateway to this private VPC, provided by the "Cloud Run" configuration.

We'd like to feed this database periodically with "Cloud Data Fusion" (CDF). CDF should fetch data from AWS S3 and push it into our database.

We've designed and validated a pipeline for that purpose but we're facing a network paradox:

  • Either CDF is public: it can read from S3 over the internet but can't reach the Cloud SQL database,
  • or CDF is private: it can reach our database but can't reach the internet to fetch from S3.

How can CDF both write to the private database and read data from the internet?
I'm surprised that a CDF instance, even a private one, can't establish an egress connection to an internet resource.

Cloud Data Fusion is a tool that helps you build pipelines (based on CDAP). If you make the Data Fusion instance private, it is the access to the tool that is private, not the runtime. On Google Cloud, the pipeline runs on a Dataproc cluster.

So now the question is: can your Dataproc cluster reach both the internet and your database?

  1. If your cluster runs in the same VPC as your Cloud SQL private IP connection, and no firewall rule prevents the communication, it's OK.
  2. If the Compute Engine instances that make up your cluster have public IPs, there is no problem: you can access public URLs. Otherwise, as John Hanley said, you can create a Cloud NAT to allow your Compute Engine instances to initiate calls to external URLs.
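
To make the Cloud NAT option concrete, here is a minimal sketch in Python that only assembles the two gcloud commands usually involved: one creating a Cloud Router, one creating a NAT gateway on that router. It runs nothing against your project. The network name, region, router name and NAT name are placeholders I made up for illustration, and I'm assuming the Dataproc cluster runs in that VPC and region.

```python
import shlex


def cloud_nat_commands(network, region, router="cdf-nat-router", nat="cdf-nat"):
    """Build the gcloud commands (as argument lists) that create a Cloud Router
    and a Cloud NAT gateway on it. The default router/NAT names are hypothetical."""
    create_router = [
        "gcloud", "compute", "routers", "create", router,
        f"--network={network}",
        f"--region={region}",
    ]
    create_nat = [
        "gcloud", "compute", "routers", "nats", "create", nat,
        f"--router={router}",
        f"--region={region}",
        # Let Google allocate the external IPs used for outbound traffic.
        "--auto-allocate-nat-external-ips",
        # Apply NAT to every subnet range of the network in this region.
        "--nat-all-subnet-ip-ranges",
    ]
    return [create_router, create_nat]


if __name__ == "__main__":
    # Placeholder values: replace with the VPC and region of the Dataproc cluster.
    for cmd in cloud_nat_commands("my-private-vpc", "europe-west1"):
        print(shlex.join(cmd))
```

Running the script prints the two commands so you can review them before executing them yourself. Cloud NAT only covers outbound connections initiated by instances without a public IP, which is what the S3 fetch needs, so the database stays private.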

