
Treat Spark RDD like plain Seq

I have a CLI application for transforming JSONs. Most of its code is mapping, flatMapping, and traversing Lists of JValues with for comprehensions. Now I want to port this application to Spark, but it seems I would need to rewrite every function 1:1, just with RDD[JValue] instead of List[JValue].

Is there any way (such as a type class) for a function to accept both Lists and RDDs?

If you want to share your processing code between the local and the Spark version, you can move the lambdas/anonymous functions that you pass in to map / flatMap into named functions and re-use them.
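For example, the per-element logic can live in one named function that both pipelines call. This is a minimal sketch; the JSON shape (a "records" field) and all names here are invented for illustration:

```scala
import org.apache.spark.rdd.RDD
import org.json4s._

object SharedLogic {
  // Hypothetical per-element logic, extracted from a lambda into a named
  // function so both the local and the Spark pipeline can reuse it.
  def extractRecords(json: JValue): List[JValue] =
    json \ "records" match {
      case JArray(items) => items       // unwrap an array of records
      case JNothing      => Nil         // field absent: contribute nothing
      case other         => List(other) // single record
    }

  // Local version over a plain List.
  def transformLocal(input: List[JValue]): List[JValue] =
    input.flatMap(extractRecords)

  // Spark version over an RDD; only the container type differs.
  def transformSpark(input: RDD[JValue]): RDD[JValue] =
    input.flatMap(extractRecords)
}
```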

If you want to re-use the logic for how the maps/flatMaps/etc. are chained, you could also create implicit conversions from both RDD and Seq to a custom trait which has only the shared functions, but implicit conversions can become quite confusing, and I don't really think this is a good idea (but you could do it if you disagree with me :)).
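A rough sketch of what that could look like, assuming you only need map and flatMap; the Pipe trait and all names are invented for illustration, and the ClassTag bounds are there because RDD.map and RDD.flatMap require them:

```scala
import scala.language.implicitConversions
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Shared abstraction exposing only the operations both backends support.
trait Pipe[A] {
  def mapPipe[B: ClassTag](f: A => B): Pipe[B]
  def flatMapPipe[B: ClassTag](f: A => TraversableOnce[B]): Pipe[B]
}

final class SeqPipe[A](val underlying: Seq[A]) extends Pipe[A] {
  def mapPipe[B: ClassTag](f: A => B): Pipe[B] =
    new SeqPipe(underlying.map(f))
  def flatMapPipe[B: ClassTag](f: A => TraversableOnce[B]): Pipe[B] =
    new SeqPipe(underlying.flatMap(f))
}

final class RddPipe[A](val underlying: RDD[A]) extends Pipe[A] {
  def mapPipe[B: ClassTag](f: A => B): Pipe[B] =
    new RddPipe(underlying.map(f))
  def flatMapPipe[B: ClassTag](f: A => TraversableOnce[B]): Pipe[B] =
    new RddPipe(underlying.flatMap(f))
}

// The implicit conversions the answer mentions: either collection type
// can be passed where a Pipe is expected.
object PipeConversions {
  implicit def seqToPipe[A](s: Seq[A]): Pipe[A] = new SeqPipe(s)
  implicit def rddToPipe[A](r: RDD[A]): Pipe[A] = new RddPipe(r)
}
```

Shared pipeline code can then be written once against Pipe[JValue] and run on either backend, though as noted above, the implicit conversions can make it hard to see which backend a given call site actually uses.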
