
Py4JJavaError: An error occurred while calling o41.load. : java.lang.ClassNotFoundException:

This is the code that gives me the error:

df=spark.read.format("xml").option("rowTag","Root").load("/content/xml")

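I want to parse XML with PySpark alone, without any other platform (i.e. Databricks or Azure). I also tried downloading the spark-xml JAR file from the Maven repository and pointing Spark at it. The code for that attempt is: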

spark=SparkSession.builder.appName("Apache spark using pyspark")\
.config("spark jars","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")\
.config("spark.executor.extraClassPath","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")\
.config("spark.executor.extraLibrary","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")\
.config("spark.driver.extraClassPath","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")\
.getOrCreate()

With this session I still get the same error at the spark.read.format line.

This is the error I get every time:

Py4JJavaError                             Traceback (most recent call last)
<ipython-input-22-59ae75a30984> in <module>
----> 1 df=spark.read.format("xml").option("rowTag","Root").load("/content/spark_/sample_corrupted.xml")

3 frames
/usr/local/lib/python3.8/dist-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling o125.load.
: java.lang.ClassNotFoundException: 
Failed to find data source: xml. Please find packages at
https://spark.apache.org/third-party-projects.html
       
    at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:587)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:675)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:725)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:207)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.ClassNotFoundException: xml.DefaultSource
    at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:661)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:661)
    at scala.util.Failure.orElse(Try.scala:224)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:661)
    ... 15 more

Is there any other way to do this?

It looks like you are missing the . in spark.jars. Try changing

.config("spark jars","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")

to

.config("spark.jars","C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar")
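
To make the fix concrete, here is a minimal sketch of a session builder that rejects keys such as "spark jars" before Spark ever sees them. The helper names (malformed_keys, missing_local_jars, build_session) are made up for this illustration and are not part of PySpark. The JAR path is the one from the question. Since the traceback shows Linux paths (/usr/local/lib/python3.8/dist-packages, /content/...), a C:/ path may not exist on the machine that actually runs Spark, so the sketch also checks that the file is there.

```python
import os
from typing import Dict, List

# Path copied from the question; point it at wherever the JAR really lives.
JAR_PATH = "C:/Users/baps/Downloads/spark-xml_2.12-0.9.0.jar"


def malformed_keys(conf: Dict[str, str]) -> List[str]:
    """Return config keys that contain whitespace, such as "spark jars"."""
    return [key for key in conf if any(ch.isspace() for ch in key)]


def missing_local_jars(conf: Dict[str, str]) -> List[str]:
    """Return local paths listed under spark.jars that do not exist on disk."""
    entries = [p.strip() for p in conf.get("spark.jars", "").split(",") if p.strip()]
    # Entries with a URI scheme (hdfs://, https://, ...) are not checked here.
    return [p for p in entries if "://" not in p and not os.path.isfile(p)]


def build_session(conf: Dict[str, str], app_name: str = "Apache spark using pyspark"):
    """Validate the settings, then create a SparkSession from them."""
    bad = malformed_keys(conf)
    if bad:
        raise ValueError(f"Malformed Spark config keys: {bad}")
    absent = missing_local_jars(conf)
    if absent:
        raise FileNotFoundError(f"JAR not found on this machine: {absent}")
    # Imported lazily so the checks above run even without PySpark installed.
    from pyspark.sql import SparkSession
    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


broken_conf = {"spark jars": JAR_PATH}  # key from the question (space)
fixed_conf = {"spark.jars": JAR_PATH}   # corrected key (dot)

print(malformed_keys(broken_conf))  # ['spark jars']
print(malformed_keys(fixed_conf))   # []
```

With the JAR in place, build_session(fixed_conf) should return a session on which spark.read.format("xml") can find the data source. If a session is already running in the notebook, getOrCreate() returns that one, and the new spark.jars value may not take effect, so calling spark.stop() or restarting the runtime first is probably needed.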

Hope this helps!
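
On the "any other way" part of the question: instead of a local JAR, Spark can fetch the package itself through the spark.jars.packages setting, which takes Maven coordinates. The sketch below assumes the coordinate com.databricks:spark-xml_2.12:0.9.0, inferred from the JAR file name in the question, and assumes the machine can reach a Maven repository when the session starts. The helper names are hypothetical.

```python
def maven_coordinate(group: str, artifact: str, scala_version: str, version: str) -> str:
    """Build a Maven coordinate of the form group:artifact_scala:version."""
    return f"{group}:{artifact}_{scala_version}:{version}"


# Assumed coordinate, derived from the file name spark-xml_2.12-0.9.0.jar.
XML_PACKAGE = maven_coordinate("com.databricks", "spark-xml", "2.12", "0.9.0")


def build_session_with_package(package: str = XML_PACKAGE):
    """Create a SparkSession that asks Spark to download the given package."""
    # Imported lazily so the coordinate helper works without PySpark installed.
    from pyspark.sql import SparkSession
    return (
        SparkSession.builder
        .appName("Apache spark using pyspark")
        .config("spark.jars.packages", package)  # resolved when the JVM starts
        .getOrCreate()
    )


print(XML_PACKAGE)  # com.databricks:spark-xml_2.12:0.9.0
```

The Scala suffix (2.12 here) has to match the Scala version your Spark build uses, and the same caveat about an already-running session applies.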

