
Spark Scala Cassandra


Please see the code below and let me know what I am doing wrong.

Using:

DSE Version - 5.1.0

Connected to Test Cluster at 172.31.16.45:9042.
[cqlsh 5.0.1 | Cassandra 3.10.0.1652 | DSE 5.1.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.

Thanks

Cassandra Table :
cqlsh:tdata> select * from map;

 sno | name
-----+------
   1 |  One
   2 |  Two 

-------------------------------------------

scala> :showSchema tdata
========================================
 Keyspace: tdata
========================================
 Table: map
----------------------------------------
 - sno  : Int (partition key column)
 - name : String

scala> val rdd = sc.cassandraTable("tdata", "map")

scala> rdd.foreach(println)

I am not getting any output here, not even an error.

You have hit a very common Spark issue. Your println code is being executed on the remote executor JVMs, so the printout goes to the STDOUT of each executor JVM process rather than to your shell. If you want to bring the data back to the driver JVM before printing, you need a collect call.

rdd
 .collect //Change from RDD to local collection
 .foreach(println)
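
As a slightly fuller sketch of the same idea, the snippet below assumes it is pasted into the DSE Spark shell, where `sc` already exists, and reuses the `tdata.map` table from the question. The explicit import is normally pre-loaded in that shell and is included only so the snippet stands on its own. The `take(10)` limit and the output format are illustrative choices, not part of the original answer.

```scala
// Assumption: running inside the DSE Spark shell, so `sc` is already defined.
import com.datastax.spark.connector._

val rdd = sc.cassandraTable("tdata", "map")

// collect() copies every row to the driver; fine for a table this small.
rdd.collect().foreach(println)

// For larger tables, take(n) brings back at most n rows instead of everything.
rdd.take(10).foreach(println)

// Extract typed columns on the executors, then print on the driver.
rdd.map(row => (row.getInt("sno"), row.getString("name")))
  .collect()
  .foreach { case (sno, name) => println(s"$sno -> $name") }
```

Because `collect` materialises the whole RDD in driver memory, `take` is usually the safer way to inspect a table whose size you do not know. The original `rdd.foreach(println)` output is not lost; it ends up in the executors' stdout logs.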
