
Apache Spark RDD and Java 8: Exception handling

I need to skip a record if I get an exception while iterating over the file content using Java 8 and Spark.

I do not want to throw an exception; I just want to skip that record and continue with the other records.

Code example:

JavaRDD<String> fileRDD = sc.textFile("filePath")
                .map(line -> {
                    try {
                        String[] parts = line.split("\\|");
                        Long key = Long.parseLong(parts[0]);
                        return line;
                    } catch (NumberFormatException nfe) {
                        // if I throw a RuntimeException here it works fine,
                        // but I don't want to throw an exception, I just want to skip the line.
                        // How do I do that using Java 8 stream methods?
                    }
                });

You can use filter instead of map:

JavaRDD<String> fileRDD = sc.textFile("filePath")
            .filter(line -> {
                try {
                    String[] parts = line.split("\\|");
                    Long key = Long.parseLong(parts[0]);
                    return true;
                } catch (NumberFormatException nfe) {
                    return false;
                }
            });
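
The filter only keeps the lines that parse; you would still need a second map to turn them into your Model objects. For completeness, here is a minimal sketch of doing both in a single transformation with flatMap, which lets each line produce either one output or none. It assumes Spark 2.x (where the flatMap function returns an Iterator) and a hypothetical Model class holding the parsed key and the raw line; adjust to your actual class.

import java.util.Collections;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical stand-in for the Model type from the question.
class Model implements java.io.Serializable {
    final long key;
    final String raw;

    Model(long key, String raw) {
        this.key = key;
        this.raw = raw;
    }
}

public class SkipBadLines {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "skip-bad-lines");

        // flatMap: a good line yields one Model, a bad line yields nothing,
        // so malformed records are dropped without any exception escaping the job.
        JavaRDD<Model> fileRDD = sc.textFile("filePath")
                .flatMap(line -> {
                    try {
                        String[] parts = line.split("\\|");
                        long key = Long.parseLong(parts[0]);
                        return Collections.singletonList(new Model(key, line)).iterator();
                    } catch (NumberFormatException | ArrayIndexOutOfBoundsException e) {
                        // Skip the record: return an empty iterator instead of throwing.
                        return Collections.<Model>emptyIterator();
                    }
                });

        System.out.println("parsed records: " + fileRDD.count());
        sc.stop();
    }
}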

String[] parts = line.split("|");

The pipe character should be escaped, because | is a regex metacharacter (alternation).

String[] parts = line.split("\\\\|");

See: https://stackoverflow.com/a/9808719/3662739
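
A quick self-contained illustration of why the escape matters: an unescaped split("|") is interpreted as alternation between two empty patterns, so the string is split after every character (output shown as observed on Java 8).

import java.util.Arrays;

public class SplitEscapeDemo {
    public static void main(String[] args) {
        String line = "42|foo|bar";

        // Unescaped pipe: splits after every character.
        System.out.println(Arrays.toString(line.split("|")));
        // [4, 2, |, f, o, o, |, b, a, r]

        // Escaped pipe: matches the literal | character.
        System.out.println(Arrays.toString(line.split("\\|")));
        // [42, foo, bar]
    }
}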
