
How to connect Cassandra with Spark using Scala on Windows

I am trying to connect Spark and Cassandra using Scala as described here: http://www.planetcassandra.org/blog/kindling-an-introduction-to-spark-with-cassandra/. I am running into errors in the steps under the heading:

"To load the connector into the Spark Shell:" “要将连接器加载到Spark Shell中:”

val test_spark_rdd = sc.cassandraTable("test_spark", "test")

When I then run test_spark_rdd.first, it fails with the error:

Exception in task 0.0 in stage 0.0 (TID 0): java.lang.NullPointerException
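For reference, the full spark-shell sequence from the linked tutorial looks roughly like this (a sketch, not verbatim from the post; the keyspace and table names test_spark/test are taken from the question above, and the Cassandra host is assumed to be localhost):

```scala
// In spark-shell, started with the connector jar on the classpath, e.g.:
//   spark-shell --jars spark-cassandra-connector-assembly.jar

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Stop the default context so it can be rebuilt with the Cassandra host set
sc.stop

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "localhost") // assumed host

val sc = new SparkContext(conf)

// Read the table as an RDD of CassandraRow and pull the first row
val test_spark_rdd = sc.cassandraTable("test_spark", "test")
test_spark_rdd.first
```

The failure reported here happens on the final test_spark_rdd.first call, i.e. when the first job is actually submitted to an executor.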

I have uploaded the complete stack trace here:

https://docs.google.com/document/d/1UjGXKifD6chq7-WrHd3GT3LoNcw8GawxAPeOtiEjKvM/edit?usp=sharing

Some rpc settings from the cassandra.yaml file:

rpc_address: localhost 
# rpc_interface: eth1 
# rpc_interface_prefer_ipv6: false 
# port for Thrift to listen for clients on 
rpc_port: 9160 
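Note that rpc_port: 9160 is the legacy Thrift port; the DataStax Spark connector talks to Cassandra over the native (CQL) protocol, which is configured separately in cassandra.yaml. A sketch of the relevant settings, shown with their stock defaults (assuming an unmodified configuration):

```yaml
# Native transport (CQL) settings, which the Spark connector uses
start_native_transport: true
native_transport_port: 9042
```

So the connector should be able to reach localhost:9042 regardless of the Thrift settings above.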

My spark-defaults config file:

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
#spark.serializer                 org.apache.spark.serializer.KryoSerializer
#spark.driver.memory              5g
#spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.cassandra.connection.host localhost
15/08/04 21:24:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
    java.lang.NullPointerException
            at java.lang.ProcessBuilder.start(Unknown Source)
            at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
            at org.apache.hadoop.util.Shell.run(Shell.java:418)
            at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
            at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
            at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)

It looks like the underlying forked executor process either failed to start or could not perform an operation on the local filesystem. Make sure the default Spark directories are accessible by the executor process.
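On Windows specifically, a NullPointerException thrown from org.apache.hadoop.util.Shell at ProcessBuilder.start (as in the trace above, where FileUtil.chmod shells out to an external command) is often a sign that Hadoop's native helper winutils.exe cannot be found. A possible fix to try before relaunching spark-shell (a sketch; C:\hadoop is an assumed install location, not taken from the question):

```
:: Windows cmd sketch: point Hadoop at a directory containing bin\winutils.exe
set HADOOP_HOME=C:\hadoop
set PATH=%PATH%;%HADOOP_HOME%\bin

:: Verify winutils is found on the PATH
winutils.exe ls \tmp
```

If winutils.exe resolves and the Spark local/temp directories are writable by your user, the chmod call in the stack trace should succeed.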
