
Spark-Java: Display join RDD

I am trying to join two pair RDDs as shown below, where:

lat1: (K, V), where K is Integer and V is Double
lat2: (K, V), where K is Integer and V is Double

   JavaPairRDD<Integer, Tuple2<Double, Double>> latlong = lat1.join(lat2);

I am assuming the new RDD will be K, [V1, V2], and I want to display the new RDD.

Also, if I want to perform operations based on the values, how should I do that?

Please suggest a solution using the Spark Java API.

P.S.: I have seen many answers in Scala, but my requirement is to implement this in Java.

From the Spark documentation:

When join is called on datasets of type (K, V) and (K, W), returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.
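To make the "all pairs of elements for each key" part concrete, here is a minimal plain-Java sketch that mimics the shape of the join result on small in-memory lists. It is not Spark code and not how Spark implements join; the class name JoinShapeSketch and the sample values are made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;

class JoinShapeSketch {

    // Plain-Java stand-in for an inner join of two keyed lists (hypothetical helper):
    // every left value is paired with every right value that shares its key.
    static List<Entry<Integer, Entry<Double, Double>>> join(
            List<Entry<Integer, Double>> left,
            List<Entry<Integer, Double>> right) {
        List<Entry<Integer, Entry<Double, Double>>> out = new ArrayList<>();
        for (Entry<Integer, Double> l : left) {
            for (Entry<Integer, Double> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    Entry<Double, Double> pair = new SimpleEntry<>(l.getValue(), r.getValue());
                    out.add(new SimpleEntry<>(l.getKey(), pair));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry<Integer, Double>> lat1 = new ArrayList<>();
        lat1.add(new SimpleEntry<>(1, 10.5));
        lat1.add(new SimpleEntry<>(1, 11.5));
        lat1.add(new SimpleEntry<>(2, 20.5)); // key 2 has no match, so it is dropped

        List<Entry<Integer, Double>> lat2 = new ArrayList<>();
        lat2.add(new SimpleEntry<>(1, 30.5));
        lat2.add(new SimpleEntry<>(3, 40.5)); // key 3 has no match, so it is dropped

        for (Entry<Integer, Entry<Double, Double>> e : join(lat1, lat2)) {
            System.out.println(e.getKey() + " -> ("
                    + e.getValue().getKey() + ", " + e.getValue().getValue() + ")");
        }
        // prints:
        // 1 -> (10.5, 30.5)
        // 1 -> (11.5, 30.5)
    }
}
```

Key 1 appears twice on the left and once on the right, so it yields two (K, (V, W)) records; keys present on only one side do not appear at all.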

So you are right with this assumption: 因此,您对以下假设是正确的:

JavaPairRDD<Integer, Tuple2<Double, Double>> latlong = lat1.join(lat2);

When you need to work with the values in a JavaPairRDD, you can use the #mapValues() method:

Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.

For displaying the JavaPairRDD you can use the same output methods as usual, e.g. #saveAsTextFile().
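For a small result, one way to see the joined records on the console is to collect them to the driver and print them. The following is a rough sketch that assumes Spark's Java API is on the classpath and runs in local mode; the class name DisplayJoinedRdd, the render helper, the app name, and the sample data are illustrative and not from the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

class DisplayJoinedRdd {

    // Hypothetical helper: collects the joined RDD to the driver and
    // formats one line per (key, (left, right)) record.
    static List<String> render(JavaPairRDD<Integer, Tuple2<Double, Double>> latlong) {
        List<String> lines = new ArrayList<>();
        for (Tuple2<Integer, Tuple2<Double, Double>> row : latlong.collect()) {
            lines.add(row._1() + " -> (" + row._2()._1() + ", " + row._2()._2() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Local master and app name are assumptions for a quick experiment.
        SparkConf conf = new SparkConf().setAppName("display-join").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<Tuple2<Integer, Double>> data1 = Arrays.asList(
                    new Tuple2<Integer, Double>(1, 10.5),
                    new Tuple2<Integer, Double>(2, 20.5));
            List<Tuple2<Integer, Double>> data2 = Arrays.asList(
                    new Tuple2<Integer, Double>(1, 30.5),
                    new Tuple2<Integer, Double>(2, 40.5));

            JavaPairRDD<Integer, Double> lat1 = sc.parallelizePairs(data1);
            JavaPairRDD<Integer, Double> lat2 = sc.parallelizePairs(data2);

            JavaPairRDD<Integer, Tuple2<Double, Double>> latlong = lat1.join(lat2);

            // The order of the collected records is not guaranteed.
            for (String line : render(latlong)) {
                System.out.println(line);
            }
        }
    }
}
```

Because collect() pulls every record into driver memory, it only suits small results; for larger data, take(n) for a sample or saveAsTextFile() is the safer choice.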


When you need to map values in (K, (V, W)) to something else like (K, VW), you can use the mentioned mapValues() transformation:

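// Requires org.apache.spark.api.java.function.Function and scala.Tuple2.
// Turns each (V, W) value into the string "V-W"; the keys stay unchanged.
JavaPairRDD<Integer, String> pairs = latlong.mapValues(
        new Function<Tuple2<Double, Double>, String>() {
          @Override
          public String call(Tuple2<Double, Double> value) throws Exception {
            return value._1() + "-" + value._2();
          }
        });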
