
Databricks connect to IntelliJ + Python: Exception in thread "main" java.lang.NoSuchMethodError

I am trying to connect my Databricks workspace to my IDE.

I do not have Spark and/or Scala downloaded on my machine, but I did install PySpark (pip install pyspark). I set up the necessary environment variables and created a Hadoop folder containing a bin folder, in which I placed a winutils.exe file.
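For reference, the winutils setup described above can be sketched like this on Windows (the C:\Hadoop location is an assumption; adjust it to wherever you actually placed the folder):

```shell
REM Assumed layout: C:\Hadoop\bin\winutils.exe
REM Set the variables persistently from a command prompt, then restart the IDE:
setx HADOOP_HOME "C:\Hadoop"
setx PATH "%PATH%;%HADOOP_HOME%\bin"
```

Note that setx only affects new processes, so IntelliJ must be restarted before it sees HADOOP_HOME.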

This was a step-wise process in which, slowly but steadily, all my errors were solved, except for the last one:

import logging
from pyspark.sql import SparkSession
from pyspark import SparkConf

if __name__ == "__main__":
    spark = SparkSession.builder.getOrCreate()
    spark.sparkContext.setLogLevel("OFF")

This gives:

1/03/30 15:14:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Exception in thread "main" java.lang.NoSuchMethodError: py4j.GatewayServer$GatewayServerBuilder.securityManager(Lpy4j/security/Py4JSecurityManager;)Lpy4j/GatewayServer$GatewayServerBuilder;
    at org.apache.spark.api.python.Py4JServer.<init>(Py4JServer.scala:68)
    at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:37)
    at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

So the first warning is probably due to the fact that I do not have Hadoop/Spark installed. However, I read that as long as the Windows executable winutils.exe is in the bin folder of Hadoop, this should work. (Before I had winutils in that folder, other errors arose, which I dealt with by adding the winutils.exe file.) So my question is about the Exception in thread "main" error.

Any ideas?

You need to uninstall PySpark, as described in the documentation. Per the documentation:

Having both installed will cause errors when initializing the Spark context in Python. This can manifest in several ways, including "stream corrupted" or "class not found" errors. If you have PySpark installed in your Python environment, ensure it is uninstalled before installing databricks-connect.
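The conflict arises because databricks-connect ships its own pyspark package, so a vanilla PySpark install shadows or mixes with it. One quick way to see which distribution is actually providing pyspark is a sketch like this (module_origin is a hypothetical helper name, not part of any library):

```python
import importlib.util

def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if absent."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# After uninstalling vanilla PySpark and installing databricks-connect, this path
# should sit inside your databricks-connect environment's site-packages.
print(module_origin("pyspark"))
```

If the printed path points at a leftover standalone PySpark install, the NoSuchMethodError above (a py4j version mismatch between the two distributions) is the expected symptom.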

So you need to do:

pip uninstall pyspark
pip uninstall databricks-connect
pip install -U databricks-connect==5.5.*  # or X.Y.* to match your cluster version.
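After reinstalling, the classic databricks-connect client includes a CLI that can validate the setup end to end; this is a sketch and requires your own workspace URL, access token, and cluster ID:

```shell
# Prompts interactively for host, token, cluster ID, org ID, and port:
databricks-connect configure
# Runs a series of connectivity checks against the configured cluster:
databricks-connect test
```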

