How to preserve all Subgroups while applying nested groupingBy collector

I am trying to group a list of employees by gender and department.

How do I ensure that all departments are included, in sorted order, for each gender, even when the count for that gender is zero?

Currently, I have the following code and output:

employeeRepository.findAll().stream()
            .collect(Collectors.groupingBy(Employee::getGender, 
                        Collectors.groupingBy(Employee::getDepartment, 
                                              Collectors.counting())));

//output
//{MALE={HR=1, IT=1}, FEMALE={MGMT=1}}

Preferred output is:

{MALE={HR=1, IT=1, MGMT=0}, FEMALE={HR=0, IT=0, MGMT=1}}

To achieve that, you have to group by department first, and only then by gender, not the other way around.

The first collector, groupingBy(Employee::getDepartment, downstream), splits the data set into groups by department. Its downstream collector, partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE, downstream), divides the employees of each department into two parts based on gender. Because partitioningBy always produces entries for both true and false, no gender is dropped even when its count is zero. Finally, Collectors.counting(), applied as the innermost downstream, provides the number of employees of each gender for every department.

So the intermediate map produced by the collect() operation will be of type Map<String, Map<Boolean, Long>>: the employee count by gender (Boolean) for each department (for simplicity, the department is a plain string).
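The intermediate map described above can be inspected in isolation. The following is a minimal sketch, assuming a simplified stand-in class (Emp, a hypothetical name, not the original Employee) with a department and a gender; it only illustrates that partitioningBy keeps the zero counts.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionSketch {
    enum Gender { FEMALE, MALE }

    // Hypothetical stand-in for the Employee class, used only in this sketch
    static class Emp {
        final String department;
        final Gender gender;

        Emp(String department, Gender gender) {
            this.department = department;
            this.gender = gender;
        }

        String getDepartment() { return department; }
        Gender getGender() { return gender; }
    }

    // Department -> (isMale -> count); both true and false keys are always present
    static Map<String, Map<Boolean, Long>> countByDepartment(List<Emp> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Emp::getDepartment,
                        Collectors.partitioningBy(e -> e.getGender() == Gender.MALE,
                                Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Emp> employees = List.of(new Emp("IT", Gender.MALE),
                                      new Emp("HR", Gender.MALE),
                                      new Emp("MGMT", Gender.FEMALE));

        Map<String, Map<Boolean, Long>> counts = countByDepartment(employees);
        System.out.println(counts.get("MGMT").get(true));  // prints 0 (no male employees in MGMT)
        System.out.println(counts.get("MGMT").get(false)); // prints 1
    }
}
```

Note that this trick relies on there being exactly two genders; with more enum constants, the zero entries would have to be filled in some other way.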

The next step is to transform this map into Map<Employee.Gender, Map<String, Long>>: the employee count by department for each gender.

My approach is to create a stream over the entry set and replace each entry with new entries keyed by gender. To preserve the department information, the value of each new entry is itself an entry, with the department as its key and the count for that department as its value.

Then collect the stream of entries with groupingBy, using the entry key as the classifier. Apply mapping as a downstream collector to extract the nested entry, and then apply Collectors.toMap() to collect the entries of type Map.Entry<String, Long> into a map.

"all departments are included in a sorted order"

To ensure the order of the nested map (count by department), a NavigableMap should be used.

In order to do that, the flavor of toMap() that expects a mapFactory needs to be used (it also expects a mergeFunction, which isn't really useful for this task since there will be no duplicates, but it has to be provided as well).
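As a small illustration of the four-argument toMap(), here is a minimal sketch that collects a few department entries into a TreeMap. The class and method names (SortedToMapSketch, toSortedMap) are made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SortedToMapSketch {

    // Collects entries into a TreeMap so that keys come out in natural (sorted) order
    static NavigableMap<String, Long> toSortedMap(List<Map.Entry<String, Long>> entries) {
        return entries.stream()
                .collect(Collectors.toMap(Map.Entry::getKey,
                                          Map.Entry::getValue,
                                          (left, right) -> left, // required by this overload; keeps the first value on a duplicate key
                                          TreeMap::new));        // mapFactory: decides the map implementation
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> entries = List.of(Map.entry("IT", 1L),
                                                        Map.entry("HR", 1L),
                                                        Map.entry("MGMT", 0L));

        System.out.println(toSortedMap(entries)); // prints {HR=1, IT=1, MGMT=0}
    }
}
```

The merge function is never invoked here because every key is unique, but the overload that accepts a map factory cannot be called without it.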

public static void main(String[] args) {
    List<Employee> employeeRepository = 
            List.of(new Employee("IT", Employee.Gender.MALE),
                    new Employee("HR", Employee.Gender.MALE),
                    new Employee("MGMT", Employee.Gender.FEMALE));

    Map<Employee.Gender, NavigableMap<String, Long>> departmentCountByGender = employeeRepository
            .stream()
            .collect(Collectors.groupingBy(Employee::getDepartment, // Map<String, Map<Boolean, Long>> - department to *employee count* by gender
                        Collectors.partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE,
                                                  Collectors.counting())))
            .entrySet().stream()
            .flatMap(entryDep -> entryDep.getValue().entrySet().stream()
                    .map(entryGen -> Map.entry(entryGen.getKey() ? Employee.Gender.MALE : Employee.Gender.FEMALE,
                                               Map.entry(entryDep.getKey(), entryGen.getValue()))))
            .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue,
                                Collectors.toMap(Map.Entry::getKey,
                                                 Map.Entry::getValue,
                                                 (v1, v2) -> v1,
                                                 TreeMap::new))));

    System.out.println(departmentCountByGender);
}

Dummy Employee class used for demo-purposes:

class Employee {
    enum Gender {FEMALE, MALE};

    private String department;
    private Gender gender;
    // etc.
    
    // constructor, getters
}

Output

{FEMALE={HR=0, IT=0, MGMT=1}, MALE={HR=1, IT=1, MGMT=0}}

Alternatively, you can keep your original code and post-process its result:

// distinct() keeps each department only once, even if several employees share it
List<String> deptList = employees.stream().map(Employee::getDepartment).distinct().sorted().toList();

Map<Gender, Map<String, Long>> tmpResult = employees.stream()
        .collect(Collectors.groupingBy(Employee::getGender, Collectors.groupingBy(Employee::getDepartment, Collectors.counting())));

Map<Gender, Map<String, Long>> finalResult = new HashMap<>();

for (Map.Entry<Gender, Map<String, Long>> entry : tmpResult.entrySet()) {
    Map<String, Long> val = new LinkedHashMap<>();
    for (String dept : deptList) {
        val.put(dept, entry.getValue().getOrDefault(dept, 0L));
    }

    finalResult.put(entry.getKey(), val);
}

System.out.print(finalResult);
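The core of the loop above is the zero-filling step for a single gender. A minimal sketch of just that step, pulled into a helper with a hypothetical name (zeroFill), might look like this:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ZeroFillSketch {

    // Builds a map that contains every key from sortedKeys, in that order,
    // using 0 for keys that are missing from counts
    static Map<String, Long> zeroFill(Map<String, Long> counts, List<String> sortedKeys) {
        Map<String, Long> filled = new LinkedHashMap<>();
        for (String key : sortedKeys) {
            filled.put(key, counts.getOrDefault(key, 0L));
        }
        return filled;
    }

    public static void main(String[] args) {
        Map<String, Long> femaleCounts = Map.of("MGMT", 1L);
        List<String> departments = List.of("HR", "IT", "MGMT");

        System.out.println(zeroFill(femaleCounts, departments)); // prints {HR=0, IT=0, MGMT=1}
    }
}
```

A LinkedHashMap is used so that the result keeps the order of the (already sorted) department list.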

Readability and maintainability will probably suffer if you try to achieve the result in a single statement.

However, there is an alternative if you don't mind using a third-party library, abacus-common:

Map<Gender, Map<String, Integer>> result = Stream.of(employees)
        .groupByToEntry(Employee::getGender, MoreCollectors.countingIntBy(Employee::getDepartment)) // step 1) group by gender 
        .mapValue(it -> Maps.newMap(deptList, Fn.identity(), dept -> it.getOrDefault(dept, 0), IntFunctions.ofLinkedHashMap())) // step 2) process the value.
        .toMap();

Disclosure: I'm the developer of abacus-common.
