
python-mysql in Cloud DataFlowRunner?

I currently have some code that queries MySQL. I'd like to run this code as part of an Apache Beam pipeline on the DataFlowRunner. Each time I try to integrate MySQLdb, the job hangs. It works fine with the DirectRunner, and it also worked on the DataFlowRunner before I added the MySQLdb dependencies.

Here's my setup.py

I've added comments to the lines that appear to cause the DataFlowRunner to hang.

I've tried running the example wordcount with only the apt-get step and the PyPI dependency added.

The expected result is to be able to add the MySQL dependencies and still run the wordcount example.

Update: See README for what I ended up doing.

For anyone else who goes down this rabbit hole: if you're using SQLAlchemy, just use mysql+pymysql as the scheme of your URL. If you're not using an ORM, just use pymysql directly.
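As a rough illustration of the two options above, here is a minimal sketch. PyMySQL is a pure-Python driver, so it installs from PyPI without a system-level apt-get step, which is presumably why it sidesteps the hang. The helper names (`build_mysql_url`, `make_engine`, `connect_without_orm`) and the credentials are hypothetical placeholders, not taken from the original setup.py or README.

```python
from urllib.parse import quote


def build_mysql_url(user, password, host, database, port=3306):
    """Return a SQLAlchemy URL that selects the PyMySQL driver."""
    # Percent-encode the credentials so characters such as '@' or '/'
    # do not break URL parsing.
    return (
        f"mysql+pymysql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def make_engine(url):
    """SQLAlchemy route: needs the SQLAlchemy and PyMySQL packages installed."""
    import sqlalchemy  # imported lazily, only needed when an engine is created
    return sqlalchemy.create_engine(url)


def connect_without_orm(user, password, host, database, port=3306):
    """Plain-driver route: needs only the PyMySQL package installed."""
    import pymysql  # imported lazily, only needed when a connection is opened
    return pymysql.connect(
        host=host, user=user, password=password, database=database, port=port
    )


# Placeholder credentials, for illustration only.
url = build_mysql_url("beam_user", "p@ss/word", "10.0.0.5", "analytics")
print(url)
```

In a Beam pipeline you would typically call `make_engine(url)` or `connect_without_orm(...)` on the worker rather than at module import time, and list the Python packages in your setup.py so the Dataflow workers install them.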
