
How to convert JavaRDD<List<String>> to JavaRDD<String> and write to a file without "[" and "]"

I have a JavaRDD<List<String>>, and when I write it to a file with the call below, each line of the output starts with "[" and ends with "]":

javacontext.parallelize(rdd).coalesce(1, true).saveAsTextFile("dirname");

Can I convert the JavaRDD<List<String>> to a JavaRDD<String> and write that to the file instead?

You can use map to apply String.join to each List<String> in the JavaRDD:

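import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

// rdd is the existing JavaRDD<List<String>>
String separator = ",";
JavaRDD<String> ys = rdd
        .map(new Function<List<String>, String>() {
            @Override
            public String call(List<String> xs) throws Exception {
                // Join the elements of one list into a single line without brackets
                return String.join(separator, xs);
            }
        });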

Or using lambdas:

JavaRDD<String> ys = rdd
        .map(xs -> String.join(separator, xs));
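The brackets appear because saveAsTextFile writes each element using its toString(), and a List prints as "[a, b, c]". The following plain-Java sketch (no Spark needed) shows the difference between the two string forms. The class name JoinDemo and the helper joinLines are hypothetical names used only for illustration; joinLines mimics on an ordinary list what the map step does for each RDD element.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoinDemo {
    // Hypothetical helper: turns each inner list into one delimited line,
    // the same transformation the map step applies to each RDD element.
    static List<String> joinLines(List<List<String>> rows, String separator) {
        return rows.stream()
                .map(xs -> String.join(separator, xs))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c");

        // List.toString() adds the brackets and ", " between elements
        System.out.println(row.toString());        // prints [a, b, c]

        // String.join uses only the separator you pass in
        System.out.println(String.join(",", row)); // prints a,b,c

        List<List<String>> rows = Arrays.asList(row, Arrays.asList("d", "e"));
        System.out.println(joinLines(rows, ","));  // prints [a,b,c, d,e]
    }
}
```

Once the RDD has been mapped this way, calling saveAsTextFile on the resulting JavaRDD<String> should write the joined lines without the surrounding brackets.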


 