How to avoid multiple Streams with Java 8

Question

I have the following code:

trainResponse.getIds().stream()
        .filter(id -> id.getType().equalsIgnoreCase("Company"))
        .findFirst()
        .ifPresent(id -> {
            domainResp.setId(id.getId());
        });

trainResponse.getIds().stream()
        .filter(id -> id.getType().equalsIgnoreCase("Private"))
        .findFirst()
        .ifPresent(id ->
            domainResp.setPrivateId(id.getId())
        );

Here I'm streaming over the list of Id objects twice.

The only difference between the two streams is in the filter() operation.

How can I achieve this in a single iteration, and what is the best approach (in terms of time and space complexity) to do it?

Answer 1

You can achieve that with the Stream API in one pass through the given data and without increasing memory consumption (i.e. the result will contain only the ids that have the required attributes).

For that, you can create a custom Collector that takes as its parameters a Collection of attributes to look for and a Function responsible for extracting the attribute from a stream element.

Here is how such a generic collector could be implemented:

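import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

/**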
 * @param <T> - the type of stream elements
 * @param <F> - the type of the key (a field of the stream element)
 */
class CollectByKey<T, F> implements Collector<T, Map<F, T>, Map<F, T>> {
    private final Set<F> keys;
    private final Function<T, F> keyExtractor;
    
    public CollectByKey(Collection<F> keys, Function<T, F> keyExtractor) {
        this.keys = new HashSet<>(keys);
        this.keyExtractor = keyExtractor;
    }
    
    @Override
    public Supplier<Map<F, T>> supplier() {
        return HashMap::new;
    }
    
    @Override
    public BiConsumer<Map<F, T>, T> accumulator() {
        return this::tryAdd;
    }
    
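    // keys.remove(key) returns true only the first time a requested key is seen,
    // so the first matching element wins. Because this mutates the shared key set,
    // a collector instance is single-use and should not be used with parallel streams.
    private void tryAdd(Map<F, T> map, T item) {
        F key = keyExtractor.apply(item);
        if (keys.remove(key)) {
            map.put(key, item);
        }
    }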
    
    @Override
    public BinaryOperator<Map<F, T>> combiner() {
        return this::tryCombine;
    }
    
    private Map<F, T> tryCombine(Map<F, T> left, Map<F, T> right) {
        right.forEach(left::putIfAbsent);
        return left;
    }
    
    @Override
    public Function<Map<F, T>, Map<F, T>> finisher() {
        return Function.identity();
    }
    
    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
}

main() - demo (dummy Id class is not shown)

public class CustomCollectorByGivenAttributes {
    public static void main(String[] args) {
        List<Id> ids = List.of(new Id(1, "Company"), new Id(2, "Fizz"),
                               new Id(3, "Private"), new Id(4, "Buzz"));
        
        Map<String, Id> idByType = ids.stream()
                .collect(new CollectByKey<>(List.of("Company", "Private"), Id::getType));
        
        idByType.forEach((k, v) -> {
            // v is the whole Id object, so pass its id value as in the question
            if (k.equalsIgnoreCase("Company")) domainResp.setId(v.getId());
            if (k.equalsIgnoreCase("Private")) domainResp.setPrivateId(v.getId());
        });
    
        System.out.println(idByType.keySet()); // printing keys - added for demo purposes
    }
}

Output

[Company, Private]

Note that after the set of keys becomes empty (i.e. all the requested data has been found), further elements of the stream are ignored, but the remaining elements still have to be processed.
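
If you want a collector instance that can be reused, a variant of the idea above can be sketched with Collector.of. This is only a sketch under assumptions: Id is a hypothetical stand-in for the dummy class, and firstByKey is an illustrative name, not something from the JDK. Instead of removing keys from a shared set, it only reads the set and relies on putIfAbsent, so the first element per key still wins.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;

class FirstByKeyDemo {

    // Hypothetical stand-in for the Id class from the question.
    static final class Id {
        private final int id;
        private final String type;
        Id(int id, String type) { this.id = id; this.type = type; }
        int getId() { return id; }
        String getType() { return type; }
    }

    // Keeps the first element seen for each requested key; all other elements are skipped.
    static <T, F> Collector<T, ?, Map<F, T>> firstByKey(Collection<F> keys,
                                                         Function<T, F> keyExtractor) {
        Set<F> wanted = new HashSet<>(keys); // only read after construction
        return Collector.<T, Map<F, T>>of(
                HashMap::new,
                (map, item) -> {
                    F key = keyExtractor.apply(item);
                    if (wanted.contains(key)) {
                        map.putIfAbsent(key, item); // first match wins
                    }
                },
                (left, right) -> {
                    right.forEach(left::putIfAbsent); // entries already in left take priority
                    return left;
                });
    }

    public static void main(String[] args) {
        List<Id> ids = Arrays.asList(new Id(1, "Company"), new Id(2, "Fizz"),
                                     new Id(3, "Private"), new Id(4, "Company"));

        Map<String, Id> idByType = ids.stream()
                .collect(firstByKey(Arrays.asList("Company", "Private"), Id::getType));

        System.out.println(idByType.get("Company").getId());
        System.out.println(idByType.get("Private").getId());
    }
}
```

Like the collector above, this still visits every element of the stream; it only avoids the second pass.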

Answer 2

IMO, the two streams solution is the most readable. And it may even be the most efficient solution using streams.

IMO, the best way to avoid multiple streams is to use a classical loop. For example:

// There may be bugs ...

boolean seenCompany = false;
boolean seenPrivate = false;
for (Id id : trainResponse.getIds()) {
   if (!seenCompany && id.getType().equalsIgnoreCase("Company")) {
      domainResp.setId(id.getId());
      seenCompany = true;
   } else if (!seenPrivate && id.getType().equalsIgnoreCase("Private")) {
      domainResp.setPrivateId(id.getId());
      seenPrivate = true;
   }
   if (seenCompany && seenPrivate) {
      break;
   }
}

It is unclear whether performing one iteration is more efficient than performing two. It will depend on the class returned by getIds() and on the iteration code.

The complicated stuff with two flags is how you replicate the short-circuiting behavior of findFirst() in your two-stream solution. I don't know if it is possible to do that at all using one stream. If it is, it will involve some pretty cunning code.

But as you can see, your original solution with two streams is clearly easier to understand than the above.

The main point of using streams is to make your code simpler. It is not about efficiency. When you try to do complicated things to make the streams more efficient, you are probably defeating the (true) purpose of using streams in the first place.
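
For illustration, here is one way the "cunning" single-stream version might look. Treat it as a sketch, not a recommendation: it relies on a stateful predicate (seen.add), which the Stream documentation discourages, so it is only reasonable on a sequential stream. The Id class, the firstIds method and the visited counter are hypothetical names made up for the demo.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class ShortCircuitSingleStream {

    // Hypothetical stand-in for the Id class from the question.
    static final class Id {
        private final int id;
        private final String type;
        Id(int id, String type) { this.id = id; this.type = type; }
        int getId() { return id; }
        String getType() { return type; }
    }

    // Demo only: counts how many source elements the stream actually visited.
    static int visited;

    // wantedLowerCase must contain the wanted types in lower case.
    static Map<String, Integer> firstIds(List<Id> ids, Set<String> wantedLowerCase) {
        Set<String> seen = new HashSet<>();
        Map<String, Integer> result = new HashMap<>();
        visited = 0;
        ids.stream()
                .peek(id -> visited++)
                .filter(id -> wantedLowerCase.contains(id.getType().toLowerCase(Locale.ROOT)))
                // Stateful predicate: passes only the first element of each type.
                .filter(id -> seen.add(id.getType().toLowerCase(Locale.ROOT)))
                // limit() is short-circuiting: stop once one element per wanted type got through.
                .limit(wantedLowerCase.size())
                .forEach(id -> result.put(id.getType().toLowerCase(Locale.ROOT), id.getId()));
        return result;
    }

    public static void main(String[] args) {
        List<Id> ids = Arrays.asList(new Id(1, "Company"), new Id(2, "Fizz"),
                                     new Id(3, "Private"), new Id(4, "Buzz"),
                                     new Id(5, "Company"));
        Set<String> wanted = new HashSet<>(Arrays.asList("company", "private"));

        System.out.println(firstIds(ids, wanted));
        System.out.println("elements visited: " + visited);
    }
}
```

Compared with the loop above, this is arguably harder to read, which rather supports the point being made here.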

Answer 3

For your list of ids, you could just use a map, then assign them after retrieving, if present.

Map<String, Integer> seen = new HashMap<>();

for (Id id : ids) {
    if (seen.size() == 2) {
        break;
    }
    seen.computeIfAbsent(id.getType().toLowerCase(), v->id.getId());
}

If you want to test it, you can use the following:

record Id(String getType, int getId) {
    @Override
    public String toString() {
        return String.format("[%s,%s]", getType, getId);
    }
}

Random r = new Random();
List<Id> ids = r.ints(20, 1, 100)
        .mapToObj(id -> new Id(
                r.nextBoolean() ? "Company" : "Private", id))
        .toList();

Edited to allow only certain types to be checked

If you have more than two types but only want to check on certain ones, you can do it as follows.

The process is the same, except that you have a Set of allowed types.
You simply check that you are processing one of those types by using contains.

Map<String, Integer> seen = new HashMap<>();

Set<String> allowedTypes = Set.of("company", "private");
for (Id id : ids) {
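    // normalize once so the map keys match the lower-case entries in allowedTypes
    String type = id.getType().toLowerCase();

    if (allowedTypes.contains(type)) {
        seen.computeIfAbsent(type,
                v -> id.getId());
        // stop as soon as every allowed type has been seen
        if (seen.size() == allowedTypes.size()) {
            break;
        }
    }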
}

Testing is similar except that additional types need to be included.

Create a list of the types that could be present and build a list of ids from them as before.
Notice that the size of allowedTypes replaces the value 2, so that more than two types can be checked before exiting the loop.

List<String> possibleTypes = 
      List.of("Company", "Type1", "Private", "Type2");
Random r = new Random();
List<Id> ids =
        r.ints(30, 1, 100)
                .mapToObj(id -> new Id(possibleTypes.get(
                        r.nextInt((possibleTypes.size()))),
                        id))
                .toList();

Answer 4

You can group by type and check the resulting map. I am assuming that the element type of the ids list is IdType.

Map<String, List<IdType>> map = trainResponse.getIds()
                                .stream()
                                .collect(Collectors.groupingBy(
                                                     id -> id.getType().toLowerCase()));

Optional.ofNullable(map.get("company")).ifPresent(ids -> domainResp.setId(ids.get(0).getId()));
Optional.ofNullable(map.get("private")).ifPresent(ids -> domainResp.setPrivateId(ids.get(0).getId()));

Answer 5

I'd recommend a traditional for loop. In addition to being easily scalable, it prevents you from traversing the collection multiple times. Your code looks like something that will be generalised in the future, hence my generic approach.

Here's some pseudo code (with errors, just for the sake of illustration)

Set<String> matches = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
for (Id id : trainResponse.getIds()) {

    if (! matches.add(id.getType())) {
        continue;
    }

    switch (id.getType().toLowerCase()) {

        case "company":
            domainResp.setId(id.getId());
            break;

        case "private":
            ...
    }
}

Answer 6

Something along these lines might work. It goes through the whole list, though, and won't stop at the first occurrence. But assuming a small list and only one Id for each type, why not?

Map<String, Consumer<String>> setters = new HashMap<>();
setters.put("Company", domainResp::setId);
setters.put("Private", domainResp::setPrivateId);

trainResponse.getIds().forEach(id -> {
    if (setters.containsKey(id.getType())) {
        setters.get(id.getType()).accept(id.getId());
    }
});
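
If stopping at the first occurrence does matter, the same setter map can be adapted, as sketched below. The idea is to remove a setter once it has been used and to leave the loop when none are left. Id and DomainResp are hypothetical stand-ins for the classes in the question, with an int id assumed for simplicity, and a case-insensitive TreeMap is used to mirror the equalsIgnoreCase checks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

class FirstMatchSetters {

    // Hypothetical stand-ins for the classes used in the question.
    static final class Id {
        private final int id;
        private final String type;
        Id(int id, String type) { this.id = id; this.type = type; }
        int getId() { return id; }
        String getType() { return type; }
    }

    static final class DomainResp {
        Integer id;
        Integer privateId;
        void setId(Integer id) { this.id = id; }
        void setPrivateId(Integer privateId) { this.privateId = privateId; }
    }

    static void fill(List<Id> ids, DomainResp domainResp) {
        // Setters that have not been applied yet, looked up ignoring case.
        Map<String, Consumer<Integer>> pending = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        pending.put("Company", domainResp::setId);
        pending.put("Private", domainResp::setPrivateId);

        for (Id id : ids) {
            // remove() returns the setter only the first time a type is seen
            Consumer<Integer> setter = pending.remove(id.getType());
            if (setter != null) {
                setter.accept(id.getId());
                if (pending.isEmpty()) {
                    break; // every wanted type has been handled
                }
            }
        }
    }

    public static void main(String[] args) {
        DomainResp resp = new DomainResp();
        fill(Arrays.asList(new Id(2, "Fizz"), new Id(3, "private"),
                           new Id(1, "Company"), new Id(9, "Company")), resp);
        System.out.println(resp.id + " " + resp.privateId);
    }
}
```

This assumes getType() never returns null, just like the original code does.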

Answer 7

We can use Collectors.filtering, available from Java 9 onwards, to collect values based on a condition.

For this scenario, I changed the code as follows:

final Map<String, String> results = trainResponse.getIds()
            .stream()
            .collect(Collectors.filtering(
                id -> id.getType().equals("Company") || id.getType().equals("Private"),
                Collectors.toMap(Id::getType, Id::getId, (first, second) -> first)));

Then get the ids from the results map.
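
Since the question is about Java 8, where Collectors.filtering is not available, roughly the same result can be sketched with a plain filter() followed by toMap(). The Id class and the firstIdByType method below are hypothetical names for illustration, an int id is assumed, and the keys are lower-cased to match the equalsIgnoreCase checks from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class FilterThenToMap {

    // Hypothetical stand-in for the Id class from the question.
    static final class Id {
        private final int id;
        private final String type;
        Id(int id, String type) { this.id = id; this.type = type; }
        int getId() { return id; }
        String getType() { return type; }
    }

    static Map<String, Integer> firstIdByType(List<Id> ids) {
        return ids.stream()
                .filter(id -> id.getType().equalsIgnoreCase("Company")
                        || id.getType().equalsIgnoreCase("Private"))
                .collect(Collectors.toMap(
                        id -> id.getType().toLowerCase(Locale.ROOT),
                        Id::getId,
                        (first, second) -> first)); // keep the first id seen per type
    }

    public static void main(String[] args) {
        Map<String, Integer> results = firstIdByType(Arrays.asList(
                new Id(1, "Company"), new Id(2, "Fizz"),
                new Id(3, "Private"), new Id(4, "company")));

        // A type that never occurred is simply missing from the map.
        System.out.println(results.get("company"));
        System.out.println(results.get("private"));
    }
}
```

As with the other collector-based answers, this is a single pass but it does not short-circuit.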

7 answers

solution1
2 ACCEPTED 2022-04-08 13:00:30

solution2
1 2022-04-08 06:33:52

solution3
1 2022-04-09 14:37:57

solution4
0 2022-04-08 07:00:48

solution5
0 2022-04-08 07:09:11

solution6
0 2022-04-08 10:01:21

solution7
0 2022-04-11 12:59:19
