
Cannot connect to Cloud SQL using Apache-Beam JDBC

I am trying to connect to Cloud SQL using the Python SDK io.jdbc module, more specifically the ReadFromJdbc class, documented here - https://beam.apache.org/releases/pydoc/current/apache_beam.io.jdbc.html

Based on that, and on the information about connecting to Cloud SQL for MySQL over JDBC here - https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/blob/main/docs/jdbc-mysql.md - I wrote the following code:

import os  # needed for os.environ below
import typing

import apache_beam as beam
import apache_beam.coders as coders
import apache_beam.io.jdbc as jdbc

from apache_beam.options.pipeline_options import PipelineOptions

pipeline_options = {
    'project': 'project-name',
    'runner': 'DataflowRunner',
    'region': 'europe-central2',
    'staging_location':"gs://temp",
    'temp_location':"gs://temp",
    'template_location':"gs://templates/temp_name"
}
pipeline_options = PipelineOptions.from_dictionary(pipeline_options)


serviceAccount = r'path\to\serviceaccount.json'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = serviceAccount

ExampleRow = typing.NamedTuple('ExampleRow',
                               [('id', int), ('migration', str)])
coders.registry.register_coder(ExampleRow, coders.RowCoder)


with beam.Pipeline(options=pipeline_options) as p:
    res = (
        p
        | "Read database list" >> jdbc.ReadFromJdbc(
            table_name='table',
            driver_class_name='com.mysql.jdbc.Driver',
            jdbc_url='jdbc:mysql:///<DATABASE_NAME>?cloudSqlInstance=<INSTANCE_CONNECTION_NAME>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<MYSQL_USER_NAME>&password=<MYSQL_USER_PASSWORD>',
            username='user',
            password='pass',
            query = "select id, migration from db.table;",
            fetch_size=1,
            classpath=["com.google.cloud.sql:mysql-socket-factory-connector-j-8:1.7.2"],
            expansion_service = 'host:6666'
        )
        | "Print results" >> beam.io.WriteToText(r'gs://output/out.csv')
    )

For the expansion service, I set up a WSL2 Python environment as documented here - https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service

Unfortunately, I get this error:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:6666: WSA Error"
        debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:6666: WSA Error {grpc_status:14, created_time:"2022-12-08T15:43:05.445755053+00:00"}"

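I tried switching expansion_service to the specific IP I got from wsl hostname -I, but it produced the same result, even though that address is reachable (I tested it with ping and by hosting a web server on it).

Am I doing something wrong? I find it hard to believe that connecting to Cloud SQL is this difficult, so I must be...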

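The transforms under the apache_beam.io.jdbc module are cross-language transforms implemented in the Beam Java SDK. During pipeline construction, the Python SDK therefore connects to a Java expansion service to expand these transforms. The instructions you followed create a Python expansion service, not a Java one.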
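
As a side note on the UNAVAILABLE error: a successful ping, or a web server on another port, does not show that anything is listening on the port passed as expansion_service. A minimal sketch for checking that with the standard library follows. The helper name port_is_open is hypothetical, and a successful TCP connection still does not prove that the listener is a Java expansion service.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # 6666 is the port used in the question; replace host as needed.
    print(port_is_open("localhost", 6666))
```

If this prints False for the address given in expansion_service, the gRPC error in the question is expected regardless of what ping reports.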

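I think the simplest thing to do is to use the default expansion service.

  • First, install a Java runtime on the machine where you construct the pipeline, and make sure the java command is available.
  • Then read from Cloud SQL with the following transform, which omits the expansion_service argument: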
        p | "Read database list" >> jdbc.ReadFromJdbc(
            table_name='table',
            driver_class_name='com.mysql.jdbc.Driver',
            jdbc_url='jdbc:mysql:///<DATABASE_NAME>?cloudSqlInstance=<INSTANCE_CONNECTION_NAME>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<MYSQL_USER_NAME>&password=<MYSQL_USER_PASSWORD>',
            username='user',
            password='pass',
            query="select id, migration from db.table;",
            fetch_size=1,
            classpath=["com.google.cloud.sql:mysql-socket-factory-connector-j-8:1.7.2"]
        )
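
The first step above can be verified from Python before the pipeline is built. This is a minimal sketch using only the standard library; find_java is a hypothetical helper name and is not part of Beam.

```python
import shutil
from typing import Optional


def find_java(search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the java executable, or None if it is not found.

    search_path is an optional PATH-style string; when it is None the
    PATH environment variable of the current process is used.
    """
    return shutil.which("java", path=search_path)


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java was not found on PATH; install a Java runtime first")
    else:
        print(f"java found at {java}")
```

Run it in the same environment that launches the pipeline, since the PATH seen by Windows Python and by a WSL shell can differ.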
