
Why does the Java Stream API omit the 'toArray(T[] a)' overload for writing to an existing array?

While the Collection API provides three overloads of toArray

  • Object[] toArray()
  • <T> T[] toArray(T[] a)
  • <T> T[] toArray(IntFunction<T[]> generator) (since Java 11)

the Stream API only provides the first two of those. That is, it does not provide a way to use an existing array as the return value of the stream-to-array conversion.
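To make the contrast concrete, here is a small sketch (the class name is made up) showing the two overloads that Stream does provide:

```java
import java.util.List;
import java.util.stream.Stream;

public class StreamToArrayDemo {
    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "c");

        // Object[] toArray(): the element type information is lost
        Object[] objects = list.stream().toArray();

        // <A> A[] toArray(IntFunction<A[]> generator): a typed result,
        // with the generator supplying a fresh array of the requested size
        String[] strings = list.stream().toArray(String[]::new);

        System.out.println(objects.getClass().getSimpleName()); // Object[]
        System.out.println(strings.getClass().getSimpleName()); // String[]
    }
}
```

Note that in both cases the stream allocates the array itself; there is no overload that accepts an existing array to write into.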

--> What is the reason for the omission of toArray(T[] a) in Stream?

I imagine the main reason is the functional spirit of streams, which makes a side effect like writing to an existing array undesirable. Is this correct? Are there other reasons that make the other two versions preferable? Maybe there are even reasons that make them preferable on a Collection as well?

The common pattern (at least according to my observations) for the use of Collection.toArray(T[]) was this one:

var array = list.toArray( new String[0] );

(shown here with a concrete element type, since new T[0] is not legal for a type variable) – meaning that an empty array of size 0 was provided.

Calling toArray() with an existing array allowed the method to return an array of the proper runtime type, as opposed to an array of Object . There was no way to pass the desired return type into the method other than via such a 'sample' array. One could imagine toArray( Class<T> elementType ) instead, but that does not work properly if T is itself a parameterised type.
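A small sketch of the difference the 'sample' array makes (class name is hypothetical):

```java
import java.util.List;

public class SampleArrayDemo {
    public static void main(String[] args) {
        List<String> list = List.of("x", "y");

        // Without a sample, the result is Object[]; casting it
        // to String[] would fail at runtime with a ClassCastException
        Object[] untyped = list.toArray();

        // The sample array carries the runtime component type
        String[] typed = list.toArray(new String[0]);

        System.out.println(untyped.getClass().getComponentType()); // class java.lang.Object
        System.out.println(typed.getClass().getComponentType());   // class java.lang.String
    }
}
```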

When lambdas were introduced, the variant taking an IntFunction<T[]> replaced the variant taking an existing array in its functionality, so the latter was omitted from Stream . I expect that Collection.toArray(T[]) will be deprecated at some point, and removed in one of the upcoming LTS versions – not the next one, or the one after that, of course!
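Since Java 11, Collection itself also has the generator-based overload, so the two idioms can be compared side by side (a minimal sketch, class name made up):

```java
import java.util.Arrays;
import java.util.List;

public class GeneratorDemo {
    public static void main(String[] args) {
        List<String> list = List.of("a", "b");

        // Generator-based overload (Java 11+ on Collection,
        // Java 8+ on Stream): the method reference supplies the type
        String[] viaGenerator = list.toArray(String[]::new);

        // The older 'sample array' idiom it functionally replaces
        String[] viaSample = list.toArray(new String[0]);

        System.out.println(Arrays.equals(viaGenerator, viaSample)); // true
    }
}
```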

If we consider the specification of Collection.toArray(T[]) :

[…] If the collection fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the runtime type of the specified array and the size of this collection.
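That "returned therein" behaviour can be observed directly (class name is made up; the spec also requires a null to be written immediately after the last element when the array has room to spare):

```java
import java.util.List;

public class FitDemo {
    public static void main(String[] args) {
        List<String> list = List.of("a", "b");

        // Array large enough: the very same array instance is returned,
        // with null written after the last element
        String[] big = new String[4];
        String[] result = list.toArray(big);
        System.out.println(result == big);  // true
        System.out.println(result[2]);      // null

        // Array too small: a new array of the same runtime type is allocated
        String[] small = new String[0];
        String[] other = list.toArray(small);
        System.out.println(other == small); // false
        System.out.println(other.length);   // 2
    }
}
```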

However there are some issues that would prevent doing the same with streams:

  • an implementation cannot know in advance whether a stream of unknown size will fit
  • with parallel streams, splitting a stream ( Spliterator ) of known size (i.e. SIZED ) can lead to two spliterators of unknown size if the original spliterator is not also SUBSIZED , so you wouldn't know where to put the data after splitting

In both cases, an implementation would still have to create new arrays and finish by copying the data, defeating the purpose of the requirement above.

Since in many scenarios you are working with streams of unknown size (a simple filter() or flatMap() removes that property), you will often run into the above limitations.
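The loss of the size property is visible through the stream's spliterator characteristics (a small sketch, class name made up):

```java
import java.util.List;
import java.util.Spliterator;

public class SizedDemo {
    public static void main(String[] args) {
        List<Integer> list = List.of(1, 2, 3, 4);

        // A stream over a List knows its exact size up front
        boolean sizedBefore = list.stream()
                .spliterator().hasCharacteristics(Spliterator.SIZED);

        // A filter() makes the final element count unpredictable,
        // so the SIZED characteristic is dropped
        boolean sizedAfter = list.stream()
                .filter(i -> i % 2 == 0)
                .spliterator().hasCharacteristics(Spliterator.SIZED);

        System.out.println(sizedBefore); // true
        System.out.println(sizedAfter);  // false
    }
}
```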

Moreover, even for the known-size case, people often allocated a new array of exactly the right size at the time of calling Collection.toArray(T[]) . This is actually counter-productive on more recent JVMs, so it would be a bad idea to carry the same issue over into the Stream API.
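These are the two idioms in question, as compared for example in Aleksey Shipilëv's benchmark write-up "Arrays of Wisdom of the Ancients" (class name here is made up):

```java
import java.util.List;

public class PresizedDemo {
    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "c");

        // Pre-sized array: looks like it saves an allocation, but the JVM
        // must zero the array on allocation and the collection must still
        // re-check the size, which measurably hurts on modern JITs
        String[] presized = list.toArray(new String[list.size()]);

        // Empty sample array: the collection allocates internally,
        // which the JIT can optimize better – the recommended idiom
        String[] zeroSized = list.toArray(new String[0]);

        System.out.println(presized.length == zeroSized.length); // true
    }
}
```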

Finally, creating a new array based on an existing one can only be done through reflection, which adds its own overhead.
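This is roughly what an implementation has to do internally when the provided array turns out to be too small (a sketch of the mechanism, not actual JDK code; class name is made up):

```java
import java.lang.reflect.Array;
import java.util.Arrays;

public class ReflectiveCopyDemo {
    public static void main(String[] args) {
        String[] sample = new String[0];
        Object[] data = {"a", "b", "c"};

        // Reflectively allocate an array with the same runtime component
        // type as the caller's sample – the only way, given just a T[]
        String[] copy = (String[]) Array.newInstance(
                sample.getClass().getComponentType(), data.length);
        System.arraycopy(data, 0, copy, 0, data.length);

        System.out.println(copy.getClass().getSimpleName()); // String[]
        System.out.println(Arrays.toString(copy));           // [a, b, c]
    }
}
```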

In the end, if we remove the requirement to fill the provided array, there does not seem to be much benefit left over the toArray(IntFunction<T[]>) version.
