
Spark 3.0.0 error creating SparkSession: pyspark.sql.utils.IllegalArgumentException: <exception str() failed>

I'm trying to build Spark 3.0.0 for my Yarn cluster, with Hadoop 2.7.3 and Hive 1.2.1. I downloaded the source and created a runnable dist with

./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive-1.2 -Phadoop-2.7 -Pyarn

We're running Spark 2.4.0 in production, so I copied the hive-site.xml, spark-env.sh and spark-defaults.conf from there.

When I try to create a SparkSession in a normal Python REPL, I get the following uninformative error. How can I debug this? I can run spark-shell and get to a Scala prompt with Hive access, seemingly without error.

Python 3.6.3 (default, Apr 10 2018, 16:07:04)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import sys
>>> os.environ['SPARK_HOME'] = '/home/pmccarthy/custom-spark-3'
>>> sys.path.insert(0,os.path.join(os.environ['SPARK_HOME'],'python','lib','py4j-src.zip'))
>>> sys.path.append(os.path.join(os.environ['SPARK_HOME'],'python'))
>>> import pyspark
>>> from pyspark.sql import SparkSession
>>> spark = (SparkSession.builder.enableHiveSupport().config('spark.master','local').getOrCreate())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pmccarthy/custom-spark-3/python/pyspark/sql/session.py", line 191, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/home/pmccarthy/custom-spark-3/python/lib/py4j-src.zip/py4j/java_gateway.py", line 1305, in __call__
  File "/home/pmccarthy/custom-spark-3/python/pyspark/sql/utils.py", line 137, in deco
    raise_from(converted)
  File "<string>", line 3, in raise_from
pyspark.sql.utils.IllegalArgumentException: <exception str() failed>
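
One way to surface the hidden JVM message despite the broken str(): in the Spark 3.0.0 source, pyspark.sql.utils.CapturedException stores the JVM-side description and stack trace as plain attributes (desc and stackTrace), and only its __str__ needs a live SQLConf. A minimal sketch along those lines (an assumption based on reading the 3.0.0 source, not code from the original post):

import os
from pyspark.sql import SparkSession
from pyspark.sql.utils import CapturedException

try:
    spark = (SparkSession.builder
             .enableHiveSupport()
             .config('spark.master', 'local')
             .getOrCreate())
except CapturedException as e:
    # str(e) fails because __str__ queries the JVM's SQLConf, which is
    # exactly what is broken here; the raw fields were captured earlier.
    print(e.desc)
    print(e.stackTrace)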

I also encountered pyspark.sql.utils.IllegalArgumentException: <exception str() failed>. On my side it was caused by using an option that was removed in Spark 3 (spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation).
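
That matches the traceback above: getOrCreate() replays every key from spark-defaults.conf through setConfString, so one stale legacy key copied from the 2.4.0 config is enough to abort session creation. A quick scan for the removed key, assuming the standard $SPARK_HOME/conf layout (a sketch, not the answerer's code):

import os

# Hedged check: flag the Spark-3-removed option if it survives in the
# spark-defaults.conf copied over from the 2.4.0 deployment.
conf_path = os.path.join(os.environ['SPARK_HOME'], 'conf', 'spark-defaults.conf')
removed_key = 'spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation'

with open(conf_path) as f:
    for lineno, line in enumerate(f, 1):
        if line.strip().startswith(removed_key):
            print('%s:%d: %s' % (conf_path, lineno, line.strip()))

Removing or commenting out any line this flags (and likewise any other spark.sql.legacy.* key dropped in 3.0) should let getOrCreate() succeed.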
