
Convert RDD[CassandraRow] to RDD[String]

Is it possible to convert an RDD[CassandraRow] to an RDD[String]? If so, is there any disadvantage to working with the converted RDD?

You can use sqlContext to read data from a Cassandra table; it returns a DataFrame. When you read a text file using sparkContext, it returns an RDD, which you can then convert to a DataFrame.
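A minimal sketch of these routes, assuming the DataStax spark-cassandra-connector is on the classpath. The keyspace `test`, the table `words`, the column `word`, and the object and method names are hypothetical placeholders, not taken from the question. The first method also shows the direct conversion the question asks about, by mapping each row to one of its string columns.

```scala
import com.datastax.spark.connector._ // adds sc.cassandraTable and CassandraRow
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

object CassandraToStrings {
  // Hypothetical keyspace and table names, used only for illustration.
  val Keyspace = "test"
  val Table = "words"

  // Direct route: map each CassandraRow to the value of one text column.
  // The column name "word" is an assumption about the table schema.
  def wordsAsStrings(sc: SparkContext): RDD[String] =
    sc.cassandraTable[CassandraRow](Keyspace, Table)
      .map(row => row.getString("word"))

  // DataFrame route: read the same table through sqlContext.
  def wordsAsDataFrame(sqlContext: SQLContext): DataFrame =
    sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> Keyspace, "table" -> Table))
      .load()

  // Text file route: sc.textFile returns RDD[String], which toDF turns
  // into a DataFrame with a single column named "line".
  def textFileAsDataFrame(sc: SparkContext, sqlContext: SQLContext, path: String): DataFrame = {
    import sqlContext.implicits._
    sc.textFile(path).toDF("line")
  }
}
```

Once both sources are DataFrames, they can be joined or compared with the same API, which is presumably why the answer steers toward DataFrames rather than an RDD[String].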

If your text files are CSV, Spark 2.0 supports a csv data source, which returns a DataFrame by default. See https://spark.apache.org/releases/spark-release-2-0-0.html#new-features and https://github.com/databricks/spark-csv/issues/
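As a rough illustration of the built-in csv source in Spark 2.0, the sketch below reads a CSV file into a DataFrame. The object name, the application name, the `local[*]` master, and the `header` and `inferSchema` settings are assumptions chosen for the example; the file path is taken from the first command-line argument.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CsvToDataFrame {
  // Read a CSV file with the csv data source built into Spark 2.0.
  def readCsv(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")      // assumes the first line holds column names
      .option("inferSchema", "true") // ask Spark to guess column types
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-example") // hypothetical application name
      .master("local[*]")     // local mode, for trying the example out
      .getOrCreate()

    val df = readCsv(spark, args(0))
    df.printSchema()
    spark.stop()
  }
}
```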

Update:

https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html
