
Dask.array conversion of a pointwise array operation

Is it feasible to "daskify" a pointwise function written in terms of general numpy operations?

Case example + Partial Solution:
For example, see here: https://github.com/SciTools/iris/pull/2964

The aim there is to apply a generalised array operation from another library, but that operation can only work on actual numpy arrays.
We want it to operate on an existing dask array instead, and to produce a lazy result from which sub-arrays can be computed efficiently.
That is why the code uses da.from_array ...
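
A rough sketch of how that da.from_array approach can work, assuming the wrapper simply realises the requested sub-array and applies the numpy-only operation to it (numpy_only_op is a made-up stand-in for the external library call, not the actual iris code):

import numpy as np
import dask.array as da

def numpy_only_op(arr):
    # Stand-in for the external library's pointwise operation,
    # which only accepts real numpy arrays.
    return np.asarray(arr) * 2 + 1

class ArraylikeWrapper:
    # Array-like view: da.from_array only needs shape, dtype, ndim
    # and __getitem__, so each chunk is realised and transformed on demand.
    def __init__(self, dask_array):
        self._dask_array = dask_array
        self.shape = dask_array.shape
        self.dtype = dask_array.dtype
        self.ndim = dask_array.ndim

    def __getitem__(self, keys):
        sub = np.asarray(self._dask_array[keys])
        return numpy_only_op(sub)

source = da.arange(1_000_000, chunks=100_000)
lazy = da.from_array(ArraylikeWrapper(source), chunks=100_000)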

Alternatives:
You could instead use deferred evaluation, but then the whole argument must be computed every time, even if only a sub-indexed part of the result is wanted.
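
If "deferred" here means something like dask.delayed, a minimal sketch of that drawback could look like this (again with a made-up numpy_only_op):

import numpy as np
import dask
import dask.array as da

def numpy_only_op(arr):
    # Made-up stand-in for the numpy-only operation.
    return np.asarray(arr) * 2 + 1

x = da.arange(1_000_000, chunks=100_000)

# The whole call becomes a single opaque task in the graph ...
lazy = da.from_delayed(dask.delayed(numpy_only_op)(x), shape=x.shape, dtype=x.dtype)

# ... so even a tiny slice forces the full input to be computed.
small_piece = lazy[:10].compute()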

Or you could use frompyfunc (http://dask.pydata.org/en/latest/array-api.html#dask.array.frompyfunc).
But that wraps a scalar function rather than an array function, which is inefficient, especially as it returns an array of objects rather than numbers.
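
A hedged example of that frompyfunc route, just to show the object-dtype result:

import dask.array as da

def scalar_op(v):
    # Called once per element -- slow, and the output dtype is object.
    return v * 2 + 1

lifted = da.frompyfunc(scalar_op, 1, 1)
x = da.arange(1_000_000, chunks=100_000)
y = lifted(x)
print(y.dtype)                      # object, not a numeric dtype
result = y.astype('int64').compute()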

Remaining Problem:
In the above partial solution, the missing piece is the ability to "see through" the opaque point-calculation wrapper, so that its dask arguments are visible to the whole graph.
Perhaps there is a way in Dask to expose the dask_array argument that is currently hidden inside the from_array(ArraylikeWrapper(dask_array)) construction?

Have you tried da.map_blocks?

x = x.map_blocks(func)

Dask also supports NumPy ufuncs with the __array_ufunc__ protocol if you're able to create those (though map_blocks is likely easier).
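
For the question's case, a hedged sketch of the map_blocks route (numpy_only_op again stands in for the external pointwise operation):

import numpy as np
import dask.array as da

def numpy_only_op(block):
    # Receives one numpy chunk at a time, never the whole array.
    return np.asarray(block) * 2 + 1

x = da.arange(1_000_000, chunks=100_000)
y = x.map_blocks(numpy_only_op, dtype=x.dtype)

# Only the chunks overlapping the requested slice are computed.
print(y[:10].compute())

# Existing NumPy ufuncs dispatch through __array_ufunc__ and stay lazy too:
z = np.sin(x)                       # still a dask array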
