Python lambda application ambiguity on multi-dimensional arrays

Question

Given the code example below, one produces an expected result and the other gives an error. Seems confusing for a beginner (ie me). I assume the arithmetic operations work element wise but others don't. What's a "good" (ie efficient) generalize way to simply perform operations on elements of a multi-dimensional array without having some underlying knowledge of the array behavior?

import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(data)

my_function = lambda x: x*2+5

result = my_function(data)
print(result)

Output: [[1 2 3 4] [5 6 7 8]] [[ 7 9 11 13] [15 17 19 21]]

import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(data)

my_function = lambda x: x if x < 3 else 0

result = my_function(data)
print(result)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Edit: I am not looking for a particular solution. Yes, I can use np.where or some other mechanisms for this exact example. I am asking about lambdas in particular and how their use seems ambiguous to the user. If it helps, the lamba / filter is coming from command line/outside of module. So it can be anything the user wants to transform the original array to - easy as square elements, or call an API and then use its output to determine the replacement value. You get the idea.

Running python 3.9.13

Answer 1

This works because operators like * and + work element-wise for numpy arrays:

In [101]: data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
     ...: print(data)
     ...: 
     ...: my_function = lambda x: x*2+5
     ...: 
     ...: result = my_function(data)
[[1 2 3 4]
 [5 6 7 8]]

my_function = lambda x: x if x < 3 else 0 fails because if x<3 is inherently a scalar operation. if/else does not iterate; it expects a simple True/False value

In [103]: data<3
Out[103]: 
array([[ True,  True, False, False],
       [False, False, False, False]])

np.vectorize is the most general tool for applying an array (or arrays) element-wise to a scalar function:

In [104]: f = np.vectorize(my_function, otypes=[int])

In [105]: f(data)
Out[105]: 
array([[1, 2, 0, 0],
       [0, 0, 0, 0]])

I included the otypes parameter to avoid one of the more common vectorize faults that SO ask about.

np.vectorize is slower than plain iteration for small cases, but becomes competative with large ones. But it's main advantage is that it's simpler to use for multidimensional arrays. It's even better when the function takes several inputs, and you want to take advantage of broadcasting.

Python lambda application ambiguity on multi-dimensional arrays

Question

1 answers

solution1
3 2022-06-16 18:28:45

Python lambda application ambiguity on multi-dimensional arrays

Question

1 answers

solution1 3 2022-06-16 18:28:45

solution1
3 2022-06-16 18:28:45