
Java 8 Stream: multiple filters

I am new to Java 8 and am trying to implement a requirement with Streams. I have a CSV file with thousands of records, in the following format:

DepId,GrpId,EmpId,DepLocation,NoofEmployees,EmpType
===
D100,CB,244340,USA,1000,Contract
D101,CB,543126,USA,1900,Permanent
D101,CB,356147,USA,1800,Contract
D100,DB,244896,HK,500,SemiContract
D100,DB,543378,HK,100,Permanent

My requirement is to filter the records on two conditions: a) EmpId starts with "244" or "543"; b) EmpType is "Contract" or "Permanent".

I tried the following:

 try (Stream<String> stream = Files.lines(Paths.get(fileAbsolutePath))) {
     list = stream
         .filter(line -> line.contains("244") || line.contains("543"))
         .collect(Collectors.toList());
 }

It filters the employees based on 244 and 543, but my concern is that, since I am using contains, it might match other data too: it checks every column, not just EmpId, and other columns might also have data starting with these numbers.

Similarly, since I am reading line by line, I see no way to enforce that EmpType must be "Permanent" or "Contract".

Am I missing any advanced options?

You can do it like this:

Pattern comma = Pattern.compile(",");
Pattern empNum = Pattern.compile("(244|543)\\d+");
Pattern empType = Pattern.compile("(Contract|Permanent)");
try (Stream<String> stream = Files.lines(Paths.get("C:\\data\\sample.txt"))) {
    List<String> result = stream.skip(2).map(l -> comma.split(l))
            .filter(s -> empNum.matcher(s[2]).matches())
            .filter(s -> empType.matcher(s[5]).matches())
            .map(s -> Arrays.stream(s).collect(Collectors.joining(",")))
            .collect(Collectors.toList());
    System.out.println(result);
} catch (IOException e) {
    e.printStackTrace();
}

First, read the file and skip the 2 header lines (the column names and the === separator). Then split each line on the , character and filter on EmpId (index 2) and EmpType (index 5). Next, join the tokens back together to form the line, and finally collect the lines into a List.
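The same steps can be sketched against an in-memory list, which makes the filter easy to try without a file. This is only a sketch: the class name CsvColumnFilter, the method filterRows, and the length check for malformed rows are assumptions added for illustration, and String.join is used in place of the Collectors.joining call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class CsvColumnFilter {
    private static final Pattern EMP_ID = Pattern.compile("(244|543)\\d+");
    private static final Pattern EMP_TYPE = Pattern.compile("Contract|Permanent");

    // Keeps the rows whose EmpId (index 2) and EmpType (index 5) both match.
    static List<String> filterRows(List<String> lines) {
        return lines.stream()
                .skip(2)                                  // column names + "===" separator
                .map(line -> line.split(","))
                .filter(cols -> cols.length > 5)          // assumption: drop short rows
                .filter(cols -> EMP_ID.matcher(cols[2]).matches())
                .filter(cols -> EMP_TYPE.matcher(cols[5]).matches())
                .map(cols -> String.join(",", cols))      // rebuild the original line
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "DepId,GrpId,EmpId,DepLocation,NoofEmployees,EmpType",
                "===",
                "D100,CB,244340,USA,1000,Contract",
                "D101,CB,543126,USA,1900,Permanent",
                "D101,CB,356147,USA,1800,Contract",
                "D100,DB,244896,HK,500,SemiContract",
                "D100,DB,543378,HK,100,Permanent");
        filterRows(sample).forEach(System.out::println);
    }
}
```

Because matches() must cover the whole field, "SemiContract" is rejected even though it contains "Contract", so the sample above keeps only the rows for 244340, 543126 and 543378.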

The elegant way is regex, which I would skip for now. The less elegant way, using the Stream API, is as follows:

list = stream.filter(line -> {
    String empId = line.split(",")[2];
    return empId.startsWith("244") || empId.startsWith("543");
}).collect(Collectors.toList());

A shorter way with the Stream API (pointed out by shmosel) is to use a mini regex:

list = stream.filter(line -> line.split(",")[2].matches("(244|543).*"))
             .collect(Collectors.toList());
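Neither snippet above checks EmpType. One possible way to cover both conditions without regex is to split each line once and test the two columns in a single predicate. The names EmpLineFilter, ALLOWED_TYPES, matches and filter are hypothetical, and the column-count check is an assumption about how malformed rows should be handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answers.
final class EmpLineFilter {
    private static final Set<String> ALLOWED_TYPES =
            new HashSet<>(Arrays.asList("Contract", "Permanent"));

    // True when EmpId (index 2) starts with 244 or 543 and EmpType (index 5) is allowed.
    static boolean matches(String line) {
        String[] cols = line.split(",");
        if (cols.length < 6) {
            return false;                      // assumption: skip short or malformed rows
        }
        String empId = cols[2];
        return (empId.startsWith("244") || empId.startsWith("543"))
                && ALLOWED_TYPES.contains(cols[5]);
    }

    static List<String> filter(List<String> lines) {
        return lines.stream()
                .filter(EmpLineFilter::matches)
                .collect(Collectors.toList());
    }
}
```

With this predicate the two header lines need no skip(2): the "===" line has too few columns, and the column-name line has "EmpId" at index 2, which starts with neither prefix.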


 