
Converting JavaRDD<List<String>> to JavaPairRDD<String, Integer>

I have a JavaRDD<List<String>> and want to turn it into a JavaPairRDD<String, Integer>, where each String is an element of the lists in the original JavaRDD and the Integer is the constant 1. Is that possible? PS: I already checked this question, but it didn't help me.

Use flatMapToPair, which flattens each list and emits one (String, 1) pair per element in a single step:

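        import java.util.ArrayList;
        import java.util.List;
        import org.apache.spark.api.java.JavaPairRDD;
        import org.apache.spark.api.java.JavaRDD;
        import org.apache.spark.api.java.function.PairFlatMapFunction;
        import scala.Tuple2;

        JavaRDD<List<String>> rdd = ...;

        // Spark 1.x signature: call() returns an Iterable.
        // On Spark 2.x and later it returns an Iterator, so declare
        // Iterator<Tuple2<String, Integer>> and return result.iterator().
        JavaPairRDD<String, Integer> pairs = rdd.flatMapToPair(new PairFlatMapFunction<List<String>, String, Integer>() {
            @Override
            public Iterable<Tuple2<String, Integer>> call(List<String> t) throws Exception {
                // Emit one (element, 1) pair for every string in the list.
                List<Tuple2<String, Integer>> result = new ArrayList<>();
                for (String str : t) {
                    result.add(new Tuple2<>(str, 1));
                }
                return result;
            }
        });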

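Alternatively, with Java 8 lambdas you can flatten the lists with flatMap and then build the pairs with mapToPair: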

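JavaRDD<List<String>> listRdd = null; // placeholder: assign your actual RDD here
// Spark 1.x: flatMap expects an Iterable, so returning the list itself works.
// Spark 2.x and later: flatMap expects an Iterator, so use list -> list.iterator().
JavaPairRDD<String, Integer> rdd = listRdd.flatMap(list -> list)
     .mapToPair(string -> new Tuple2<String, Integer>(string, 1));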
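To see what the flatten-then-pair logic produces without setting up Spark, here is a minimal sketch using plain Java streams. It is only an illustration of the semantics, not Spark code: the class FlattenToPairs and the helper toPairs are hypothetical names, and Map.Entry stands in for Tuple2.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlattenToPairs {

    // Hypothetical helper: flattens the nested lists (the flatMap step)
    // and pairs every element with the constant 1 (the mapToPair step).
    static List<Map.Entry<String, Integer>> toPairs(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)
                .<Map.Entry<String, Integer>>map(s -> new SimpleEntry<>(s, 1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(toPairs(input)); // prints [a=1, b=1, c=1]
    }
}
```

Each inner list contributes one pair per element, so duplicates are kept; in Spark you would typically follow this with reduceByKey if you want counts per string.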
