
Databricks + H2O PySparkling: addURL Py4JException


I am new to H2O and the Spark framework, and I am having trouble getting H2O + Spark (Sparkling Water) PySparkling working in Databricks. I have a 12-worker cluster running in Databricks in a Spark 1.5.2 environment.

The steps I took were as follows:

  1. Attached (installed) the libraries required by H2O (six, requests, tabulate, and future) to my cluster.

  2. Took the necessary .egg file from the sparkling-water-1.5.14/py/dist folder after unzipping the sparkling-water-1.5.14.zip package.

  3. Attached sparkling-water-assembly-1.5.14.jar to my Databricks cluster.

  4. I am able to import h2o successfully. However, when I run the following cell in my Python notebook in Databricks, I get the exception below:

    # Initiate H2OContext on top of Spark
    from pysparkling import *
    hc = H2OContext(sc).start()
    import h2o

I get the following error:

py4j.Py4JException: Method addURL([class java.net.URL]) does not exist
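To rule out the Python libraries from step 1 as a cause, a quick check like the one below can confirm they are importable on the driver. This is only a minimal sketch: `missing_modules` is a hypothetical helper (not part of H2O or PySparkling), it assumes each module name matches the package name listed above, and it uses `importlib.util.find_spec`, which requires Python 3.4 or later.

```python
import importlib.util

# Python packages the question lists as required by H2O (step 1).
# Assumption: each importable module name equals its package name.
REQUIRED = ["six", "requests", "tabulate", "future"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not importable on this node:", ", ".join(missing))
    else:
        print("All listed H2O dependencies are importable.")
```

An empty result only shows that the packages are visible to the Python interpreter running the notebook; it says nothing about the JVM side, which is where the `addURL` call fails.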

I would sincerely appreciate any guidance on how to resolve this exception.

This is a bug in PySparkling. A fix has already been committed but is still waiting for the next release; it might be introduced in 1.5.15.

In the meantime, you can try building Sparkling Water from that branch yourself and use it until we release the next version.
