How to use groupby with a counting function?
I have the data shown in the code below, and I want the following output.
Desired output:
Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED
-----------------------------------------------
HCL | 4 | 2 | 1 | 1
TCS | 5 | 2 | 0 | 2
NOTE: I store this data in a HashMap. Key: OrderId, Value: OrderModel (PairName, status).
I can achieve this with the following code, but I would like a cleaner way to do it. Please suggest other approaches if you know any.
public static void main(String[] args) {
Map<Integer, Model> test = new HashMap<Integer, Model>();
Model m1 = new Model("HCL", "Inprogress");
Model m2 = new Model("HCL", "Cancel");
Model m3 = new Model("HCL", "Inprogress");
Model m4 = new Model("HCL", "Completed");
Model a1 = new Model("TCS", "Inprogress");
Model a2 = new Model("TCS", "Inprogress");
Model a3 = new Model("TCS", "Inprogress");
Model a4 = new Model("TCS", "Completed");
Model a5 = new Model("TCS", "Completed");
int count = 1;
test.put(count++, m1);
test.put(count++, m2);
test.put(count++, m3);
test.put(count++, m4);
test.put(count++, a1);
test.put(count++, a2);
test.put(count++, a3);
test.put(count++, a4);
test.put(count, a5);
Map<String, Model> countPair = new HashMap<String, Model>();
test.forEach((k, v) -> System.out.println("Key:" + k + " Pair:" + v.getName() + " Status:" + v.getStatus()));
System.out.println(" With Count !!!!");
List<Model> list = new ArrayList<Model>();
test.entrySet().stream().forEach(e -> list.add(e.getValue()));
// Total
Map<String, Long> counted = list.stream().collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
counted.forEach((k, v) -> {
countPair.put(k, new Model.ModelBuilder().setTotal(v).build());
});
// Inprogress
counted = list.stream().filter(e -> e.getStatus().equals("Inprogress"))
.collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
counted.forEach((k, v) -> countPair.get(k).setInProgressCount(v));
// Cancel
counted = list.stream().filter(e -> e.getStatus().equals("Cancel"))
.collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
counted.forEach((k, v) -> countPair.get(k).setCancelCount(v));
// Completed
counted = list.stream().filter(e -> e.getStatus().equals("Completed"))
.collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
counted.forEach((k, v) -> countPair.get(k).setCompletedCount(v));
countPair.forEach((k, v) -> System.out.println("Pair : " + k + " : " + v.getTotal() + " , "
+ v.getInProgressCount() + " , " + v.getCancelCount() + " , " + v.getCompletedCount()));
}
Model:
public class Model {
private String name;
private String status;
private long total;
private long inProgressCount;
private long completedCount;
private long cancelCount;
static class ModelBuilder {
private long total;
private long inProgressCount;
private long completedCount;
private long cancelCount;
public ModelBuilder setTotal(long total) {
this.total = total;
return this;
}
public ModelBuilder setInProgressCount(long inProgressCount) {
this.inProgressCount = inProgressCount;
return this;
}
public ModelBuilder setCompletedCount(long completedCount) {
this.completedCount = completedCount;
return this;
}
public ModelBuilder setCancelCount(long cancelCount) {
this.cancelCount = cancelCount;
return this;
}
public Model build() {
return new Model(this);
}
}
public Model(ModelBuilder modelBuilder) {
this.total = modelBuilder.total;
this.inProgressCount = modelBuilder.inProgressCount;
this.completedCount = modelBuilder.completedCount;
this.cancelCount = modelBuilder.cancelCount;
}
public Model(String name, String status) {
super();
this.name = name;
this.status = status;
}
//getter and setter
}
I would do it like this:
List<Model> test = Arrays.asList(
new Model("HCL", "Inprogress"),
new Model("HCL", "Cancel"),
new Model("HCL", "Inprogress"),
new Model("HCL", "Completed"),
new Model("TCS", "Inprogress"),
new Model("TCS", "Inprogress"),
new Model("TCS", "Inprogress"),
new Model("TCS", "Completed"),
new Model("TCS", "Completed")
);
Map<String, Map<String, Long>> result = test.stream()
.collect(Collectors.groupingBy(Model::getName,
Collectors.groupingBy(Model::getStatus,
Collectors.counting())));
result.entrySet().forEach(System.out::println);
Output
HCL={Cancel=1, Completed=1, Inprogress=2}
TCS={Completed=2, Inprogress=3}
It shouldn't be a problem to use that result object to produce the desired output, since all the heavy lifting has already been done.
System.out.println("Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED");
System.out.println("-----------------------------------------------");
for (Entry<String, Map<String, Long>> nameEntry : result.entrySet()) {
String name = nameEntry.getKey();
Map<String, Long> statusCounts = nameEntry.getValue();
long inprogress = statusCounts.getOrDefault("Inprogress", 0L);
long cancel = statusCounts.getOrDefault("Cancel" , 0L);
long completed = statusCounts.getOrDefault("Completed" , 0L);
System.out.printf("%-5s | %5d | %10d | %6d | %8d%n", name,
inprogress + cancel + completed,
inprogress, cancel, completed);
}
Output
Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED
-----------------------------------------------
HCL   |     4 |          2 |      1 |        1
TCS   |     5 |          3 |      0 |        2
Note: Comparing that to the "Desired output" in the question, we observe that the question's example is wrong, since the INPROGRESS count for TCS is 3, not 2.
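If one result object per stock is preferred over a nested map, the same grouping can feed a custom downstream collector built with `Collector.of`. The following is only a sketch under assumptions: `Order` is a simplified stand-in for the question's `Model`, and the `StatusCounts` holder and `countByStock` method are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class StatusCountsDemo {

    // Simplified stand-in for the question's Model (hypothetical, illustration only).
    static final class Order {
        private final String name;
        private final String status;
        Order(String name, String status) { this.name = name; this.status = status; }
        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Hypothetical mutable holder for the per-stock counters.
    static final class StatusCounts {
        long total, inProgress, cancel, completed;

        // Accumulator: counts one order.
        void add(Order order) {
            total++;
            switch (order.getStatus()) {
                case "Inprogress": inProgress++; break;
                case "Cancel":     cancel++;     break;
                case "Completed":  completed++;  break;
                default: break; // unknown statuses only count towards the total
            }
        }

        // Combiner: needed when partial results are merged (e.g. parallel streams).
        StatusCounts merge(StatusCounts other) {
            total += other.total;
            inProgress += other.inProgress;
            cancel += other.cancel;
            completed += other.completed;
            return this;
        }
    }

    // Groups by stock name and folds each group into one StatusCounts in a single pass.
    static Map<String, StatusCounts> countByStock(List<Order> orders) {
        return orders.stream().collect(Collectors.groupingBy(
                Order::getName,
                TreeMap::new, // sorted keys, so rows are printed in a stable order
                Collector.of(StatusCounts::new, StatusCounts::add, StatusCounts::merge)));
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("HCL", "Inprogress"), new Order("HCL", "Cancel"),
                new Order("HCL", "Inprogress"), new Order("HCL", "Completed"),
                new Order("TCS", "Inprogress"), new Order("TCS", "Inprogress"),
                new Order("TCS", "Inprogress"), new Order("TCS", "Completed"),
                new Order("TCS", "Completed"));
        countByStock(orders).forEach((name, c) -> System.out.printf(
                "%-5s | %5d | %10d | %6d | %9d%n",
                name, c.total, c.inProgress, c.cancel, c.completed));
    }
}
```

The trade-off is that the status names are hard-coded in `StatusCounts`, whereas the nested-map version picks up any new status automatically.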
I agree with Andreas about the best stream solution. But because you tagged the question with performance, I think we should compare execution times, so I compared vijayk's solution with Andreas's solution and with a solution that does not use streams.
Here are the results for 9 objects in the list. The "stream" row is vijayk's solution, and each "faster by" value is the ratio of the two execution times:
Total execution with stream in ms: 7.642649ms
Total execution with for each in ms: 0.037637ms
Total execution with Andreas stream in ms: 0.392906ms
foreach was faster than vijayk by : 203.06211972261337
Andreas was faster than vijayk by : 19.451596565081726
foreach was faster than Andreas by : 10.439354890134707
Here are the results for 18 000 000 objects in the list:
Total execution with stream in ms: 703.025082ms
Total execution with for each in ms: 278.319758ms
Total execution with Andreas stream in ms: 504.190017ms
foreach was faster than vijayk by : 2.5259618183485197
Andreas was faster than vijayk by : 1.3943653350835783
foreach was faster than Andreas by : 1.8115494948080546
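The harness behind these numbers is not shown, so it is unknown whether any warm-up was done; for very small inputs, a first-run measurement may partly reflect class loading and JIT compilation rather than the grouping itself. As a rough sketch of one way such timings might be taken, the code below runs a few untimed warm-up rounds before measuring with `System.nanoTime()`. The `timeMillis` and `sampleData` helpers and the round counts are hypothetical, the data is synthetic, and a dedicated benchmarking tool such as JMH would give more trustworthy figures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class TimingSketch {

    // Hypothetical helper: runs the task untimed a few times so the JIT can warm up,
    // then returns the average wall-clock time of the measured runs in milliseconds.
    static double timeMillis(Supplier<?> task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.get();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / measuredRuns;
    }

    // Builds synthetic {name, status} rows so the sketch does not depend on Model.
    static List<String[]> sampleData(int size) {
        String[] names = {"HCL", "TCS"};
        String[] statuses = {"Inprogress", "Cancel", "Completed"};
        List<String[]> data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            data.add(new String[] {names[i % names.length], statuses[i % statuses.length]});
        }
        return data;
    }

    // The nested grouping from the answer above, applied to the synthetic rows.
    static Map<String, Map<String, Long>> nestedGrouping(List<String[]> data) {
        return data.stream().collect(Collectors.groupingBy(
                (String[] row) -> row[0],
                Collectors.groupingBy((String[] row) -> row[1], Collectors.counting())));
    }

    public static void main(String[] args) {
        List<String[]> data = sampleData(100_000);
        double ms = timeMillis(() -> nestedGrouping(data), 5, 10);
        System.out.printf("nested groupingBy: %.3f ms per run%n", ms);
    }
}
```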
Then I changed the streams to parallel streams, and the results look like this.
For 9 objects in the list:
Total execution with stream in ms: 20.937947ms
Total execution with for each in ms: 0.042329ms
Total execution with Andreas stream in ms: 0.496791ms
foreach was faster than vijayk by : 494.64780646837863
Andreas was faster than vijayk by : 42.14638952799064
foreach was faster than Andreas by : 11.736421838455906
For 18 000 000 objects in the list:
Total execution with stream in ms: 476.563756ms
Total execution with for each in ms: 278.438998ms
Total execution with Andreas stream in ms: 302.730519ms
foreach was faster than vijayk by : 1.7115553475738337
Andreas was faster than vijayk by : 1.5742177484259523
foreach was faster than Andreas by : 1.087241805833535
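Exactly how the streams were switched to parallel is not shown. One plausible form, sketched here on a simplified hypothetical `Order` class rather than the question's `Model`, simply replaces `stream()` with `parallelStream()`. The second method uses the JDK's `Collectors.groupingByConcurrent`, which accumulates into a single `ConcurrentMap` instead of merging per-thread maps; whether that helps would have to be measured.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ParallelGroupingSketch {

    // Simplified stand-in for the question's Model (hypothetical, illustration only).
    static final class Order {
        private final String name;
        private final String status;
        Order(String name, String status) { this.name = name; this.status = status; }
        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Same nested grouping as the sequential version, run on a parallel stream.
    static Map<String, Map<String, Long>> countParallel(List<Order> orders) {
        return orders.parallelStream().collect(Collectors.groupingBy(
                Order::getName,
                Collectors.groupingBy(Order::getStatus, Collectors.counting())));
    }

    // Variant whose outer level is a shared ConcurrentMap; the result is unordered.
    static ConcurrentMap<String, Map<String, Long>> countConcurrent(List<Order> orders) {
        return orders.parallelStream().collect(Collectors.groupingByConcurrent(
                Order::getName,
                Collectors.groupingBy(Order::getStatus, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("HCL", "Inprogress"), new Order("HCL", "Cancel"),
                new Order("TCS", "Inprogress"), new Order("TCS", "Completed"));
        System.out.println(countParallel(orders));
        System.out.println(countConcurrent(orders));
    }
}
```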
The for-each solution looks like this:
for (Model item : list) {
Model itemToIncrease;
if (countPair.containsKey(item.getName())) {
itemToIncrease = countPair.get(item.getName());
} else {
countPair.put(item.getName(), item);
itemToIncrease = item;
}
itemToIncrease.increaseTotal();
switch (item.getStatus()) {
case "Inprogress":
itemToIncrease.increaseInProgressCount();
break;
case "Cancel":
itemToIncrease.increaseCancelCount();
break;
case "Completed":
itemToIncrease.increaseCompletedCount();
break;
}
}
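Note that this loop stores the first `Model` seen for each name in `countPair` and increments its counters, so an input object doubles as the accumulator. A sketch of a variant that keeps the counters separate from the input, using `Map.computeIfAbsent`, might look as follows. The `long[]` layout, the index constants, and the simplified `Order` class standing in for `Model` are all assumptions made for illustration, and this variant was not part of the timings above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ForEachCountSketch {

    // Simplified stand-in for the question's Model (hypothetical, illustration only).
    static final class Order {
        private final String name;
        private final String status;
        Order(String name, String status) { this.name = name; this.status = status; }
        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Index layout of each counter array (hypothetical).
    static final int TOTAL = 0, IN_PROGRESS = 1, CANCEL = 2, COMPLETED = 3;

    // Single pass over the list; the input objects are never modified.
    static Map<String, long[]> count(List<Order> orders) {
        Map<String, long[]> counts = new HashMap<>();
        for (Order order : orders) {
            // Creates the counter array the first time a name is seen.
            long[] c = counts.computeIfAbsent(order.getName(), k -> new long[4]);
            c[TOTAL]++;
            switch (order.getStatus()) {
                case "Inprogress": c[IN_PROGRESS]++; break;
                case "Cancel":     c[CANCEL]++;      break;
                case "Completed":  c[COMPLETED]++;   break;
                default: break; // unknown statuses only count towards the total
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("HCL", "Inprogress"), new Order("HCL", "Cancel"),
                new Order("TCS", "Inprogress"), new Order("TCS", "Completed"));
        count(orders).forEach((name, c) ->
                System.out.println(name + " : " + Arrays.toString(c)));
    }
}
```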
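To sum it up, I would say that Andreas's solution is very good when you have a lot of data: with 18 000 000 objects the plain loop was only about 1.1 to 1.8 times faster, whereas with 9 objects it was more than 10 times faster.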