
ReadFromKafka with Python in apache-beam: Unsupported signal: 2

I've been struggling to make this work. I know this is a cross-language transform, and I installed the Java JDK on my PC (running java -version in cmd prints the correct information), but when I try to run a simple pipeline:

import apache_beam as beam
from apache_beam.io.external.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'credentialsOld.json'


def main():
    print('======================================================')
    # temp_location, staging_location and project are defined elsewhere (not shown in the question).
    beam_options = PipelineOptions(
        runner='DataflowRunner',
        temp_location=temp_location,
        staging_location=staging_location,
        project=project,
        experiments=['use_runner_v2'],
        streaming=True,
    )

    with beam.Pipeline(options=beam_options) as p:
        msgs = p | 'ReadKafka' >> ReadFromKafka(
            consumer_config={'bootstrap.servers': 'xxxxx-xxxxx...', 'group_id': 'testAB'},
            topics=['users'],
        )
        msgs | beam.FlatMap(print)


if __name__ == '__main__':
    main()

I get this error: ValueError: Unsupported signal: 2

I have tried adding the parameter expansion_service='beam:external:java:kafka:read:v1' to ReadFromKafka, but then I get:

status = StatusCode.UNAVAILABLE

details = "DNS resolution failed for beam:external:java:kafka:read:v1: UNKNOWN: OS Error"

I'm working in a Python venv environment, in case that is useful, and my Kafka cluster is on Confluent Cloud.

I'm also getting this runtime error: RuntimeError: java.lang.RuntimeException: Failed to get dependencies of beam:transform:org.apache.beam:kafka_read_without_metadata:v1 from spec urn: "beam:transform:org.apache.beam:kafka_read_without_metadata:v1"

EDIT: I'm getting the bootstrap server option from here (screenshot).

My mistake was that I was skipping the step where I have to start an expansion service. I did that with this command: java -jar beam-sdks-java-io-expansion-service-2.37.0.jar 8088 --javaClassLookupAllowlistFile='*' after downloading beam-sdks-java-io-expansion-service-2.37.0.jar from https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-expansion-service/2.36.0, and then passing its address to ReadFromKafka with expansion_service='localhost:8088'.
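A minimal sketch of how the pieces fit together on the Python side, assuming the expansion service was started on port 8088 as described. The helper names split_address, expansion_service_reachable and read_users are hypothetical and only for illustration. The point is that expansion_service takes a host:port address, not a transform URN, which is why passing 'beam:external:java:kafka:read:v1' ended in a DNS error.

```python
import socket


def split_address(address):
    """Split 'host:port' into (host, port); raise ValueError otherwise."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {address!r}")
    return host, int(port)


def expansion_service_reachable(address, timeout=2.0):
    """Return True if a TCP connection to 'host:port' succeeds."""
    host, port = split_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def read_users(pipeline, address="localhost:8088"):
    """Apply ReadFromKafka using an already running expansion service."""
    # Imported lazily so the helpers above work without apache_beam installed.
    from apache_beam.io.kafka import ReadFromKafka

    if not expansion_service_reachable(address):
        raise RuntimeError(f"no expansion service listening on {address}")
    return pipeline | "ReadKafka" >> ReadFromKafka(
        # Placeholder broker address, kept exactly as in the question.
        consumer_config={"bootstrap.servers": "xxxxx-xxxxx..."},
        topics=["users"],
        expansion_service=address,
    )
```

Inside the pipeline this would be used as msgs = read_users(p). The reachability check is optional; it only turns a missing expansion service into an earlier, more readable error.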

Then I had two minor mistakes. One was that I was using JDK 18, which I think wasn't compatible (https://beam.apache.org/get-started/quickstart-java/), so I switched to JDK 17. The other was the Python version: I used Python 3.8 instead of Python 3.10.
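To see which versions are actually being picked up, something like the following can help. It is only a sketch: the function names are hypothetical, and it reports the Java major version without deciding which versions Beam supports.

```python
import re
import subprocess
import sys


def java_major_version(version_output):
    """Extract the major version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Releases up to Java 8 report themselves as "1.x" (e.g. "1.8.0_312").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version`; the JVM prints its version banner to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)


def python_major_minor():
    """Return the running interpreter's version as (major, minor)."""
    return sys.version_info[:2]
```

Calling installed_java_major() and python_major_minor() from inside the venv shows the versions the pipeline will really use, which may differ from what a separate cmd window reports.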
