
Generate a numpy array from a python function

I have what I thought would be a simple task in numpy, but I'm having trouble.

I have a function which takes an index in the array and returns the value that belongs at that index. I would like to write the values into a numpy array efficiently.

I have found numpy.fromfunction, but it doesn't behave remotely like the documentation suggests. It seems to "vectorise" the function: instead of passing the actual indices one at a time, it passes a numpy array of indices:

import math
import numpy
from math import pi

# A, wf and len are defined elsewhere.
def vsin(i):
    return float(round(A * math.sin((2 * pi * wf) * i)))

numpy.fromfunction(vsin, (len,), dtype=numpy.int16)
# TypeError: only length-1 arrays can be converted to Python scalars

(If we use a debugger to inspect i, it is a numpy.ndarray instance.)
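The same thing can be checked without a debugger. This is a minimal sketch, using a hypothetical helper named describe (not part of the question) that only reports what it receives. numpy.fromfunction calls it once and hands back whatever it returns:

```python
import numpy

def describe(i):
    # Hypothetical helper: called exactly once, with an array holding every index.
    return type(i).__name__, i.shape, str(i.dtype)

info = numpy.fromfunction(describe, (5,), dtype=numpy.int16)
print(info)
```

The dtype argument shows up as the dtype of the index array, which is what matters for the wraparound discussed further down.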

So, if we try to use numpy's vectorised sin function:

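def vsin(i):
    # i is an array of indices; numpy.sin operates on it elementwise.
    return (A * numpy.sin((2 * pi * wf) * i)).astype(numpy.int16)

# dtype here sets the type of the index array passed to vsin, not the result type.
numpy.fromfunction(vsin, (len,), dtype=numpy.int16)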

We don't get a type error, but if len > 2**15 we get discontinuities chopping across our oscillator, because numpy is using int16_t to represent the index!

The point here isn't about sin in particular: I want to be able to write arbitrary python functions like this (whether a numpy vectorised version exists or not) and be able to run them inside a tight C loop (rather than a roundabout python one), and not have to worry about integer wraparound.

Do I really have to write my own cython extension in order to be able to do this? Doesn't numpy have support for running python functions once per item in an array, with access to the index?

It doesn't have to be a creation function: I can use numpy.empty (or indeed, reuse an existing array from somewhere else.) So a vectorised transformation function would also do.

I think the issue of integer wraparound is unrelated to numpy's vectorized sin implementation, and is independent of whether the loop runs in Python or C.

If you use a 2-byte signed integer and try to generate an array of integer values ranging from 0 to above 32767, the values will silently wrap around. The array will look like:

[0, 1, 2, ... , 32767, -32768, -32767, ...]
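The wraparound can be reproduced in isolation. This is a small sketch that forces a few indices around the 2**15 boundary into a 2-byte integer with astype; the variable names idx and wrapped are illustrative only:

```python
import numpy

# Four consecutive indices straddling the int16 maximum (32767).
idx = numpy.arange(32766, 32770)

# Casting to a 2-byte signed integer keeps only the low 16 bits.
wrapped = idx.astype(numpy.int16)
print(wrapped.tolist())
```

The last two values come out negative, which is the discontinuity seen in the oscillator.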

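The simplest solution, assuming memory is not too tight, is to use a wider integer type for the index array that fromfunction generates, so you don't have a wrap-around problem in the first place. The dtype argument of fromfunction sets the type of the indices passed to your function, not the type of the result, and with int32 the indices are safe up to about 2 billion (2**31 - 1):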

numpy.fromfunction(vsin, (len,), dtype=numpy.int32)
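Put together as a runnable sketch: the values of A, wf and n below are assumptions chosen for illustration (the question does not give them), with n picked to exceed 2**15. The result is compared against a reference built from a 64-bit index array:

```python
import numpy

A = 10000.0   # assumed amplitude, not given in the question
wf = 0.001    # assumed frequency factor, not given in the question
n = 40000     # assumed length, deliberately larger than 2**15

def vsin(i):
    # i is the whole index array; the result is cast to int16 samples.
    return (A * numpy.sin((2 * numpy.pi * wf) * i)).astype(numpy.int16)

# int32 indices: wide enough that no index wraps at this length.
samples = numpy.fromfunction(vsin, (n,), dtype=numpy.int32)

# Reference computed from an explicit 64-bit index array.
reference = vsin(numpy.arange(n, dtype=numpy.int64))
print(samples.dtype, samples.shape, numpy.array_equal(samples, reference))
```

Only the index array is widened here; the samples themselves are still stored as int16 because vsin casts its result.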

numpy is optimized to work fast by passing whole arrays between vectorized functions. In general, I think the numpy tools are inconvenient for running a scalar function once per array element.
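If a scalar Python function is all that is available, two numpy tools will at least fill the array for you: numpy.fromiter and numpy.vectorize. Both still call the Python function once per element, so neither gives a tight C loop; the numpy documentation describes vectorize as a convenience rather than a performance feature. A sketch, again with assumed values for A, wf and n, and a hypothetical scalar function vsin_scalar:

```python
import math
import numpy

A = 10000.0   # assumed amplitude, not given in the question
wf = 0.001    # assumed frequency factor, not given in the question
n = 40000     # assumed length

def vsin_scalar(i):
    # Plain Python function: takes one integer index, returns one integer sample.
    return round(A * math.sin((2 * math.pi * wf) * i))

# Option 1: feed a generator to fromiter; indices are ordinary Python ints,
# so they cannot wrap around.
samples_iter = numpy.fromiter(
    (vsin_scalar(i) for i in range(n)), dtype=numpy.int16, count=n
)

# Option 2: wrap the function with vectorize and apply it to an index array.
vsin_vec = numpy.vectorize(vsin_scalar, otypes=[numpy.int16])
samples_vec = vsin_vec(numpy.arange(n))

print(numpy.array_equal(samples_iter, samples_vec))
```

Either result could also be written into an existing array with slice assignment, for example out[:] = samples_iter for a preallocated out of the same length.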
