
Python lambda application ambiguity on multi-dimensional arrays

Of the two code examples below, one produces the expected result and the other raises an error. This is confusing for a beginner (i.e. me). I assume the arithmetic operations work element-wise but other operations don't. What's a "good" (i.e. efficient) general way to perform operations on the elements of a multi-dimensional array without needing underlying knowledge of how the array behaves?

import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(data)

my_function = lambda x: x*2+5

result = my_function(data)
print(result)

Output:

[[1 2 3 4]
 [5 6 7 8]]
[[ 7  9 11 13]
 [15 17 19 21]]

import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(data)

my_function = lambda x: x if x < 3 else 0

result = my_function(data)
print(result)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Edit: I am not looking for a particular solution. Yes, I can use np.where or some other mechanism for this exact example. I am asking about lambdas in particular, and why their use seems ambiguous to the user. If it helps, the lambda / filter comes from the command line, i.e. from outside the module. So it can be anything the user wants to use to transform the original array: something as easy as squaring the elements, or calling an API and then using its output to determine the replacement value. You get the idea.

Running Python 3.9.13.

The first example works because operators like * and + work element-wise on numpy arrays:

In [101]: data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
     ...: print(data)
     ...: 
     ...: my_function = lambda x: x*2+5
     ...: 
     ...: result = my_function(data)
[[1 2 3 4]
 [5 6 7 8]]

my_function = lambda x: x if x < 3 else 0 fails because if x < 3 is inherently a scalar operation. if/else does not iterate; it expects a single True/False value, but data < 3 produces a whole array of them:

In [103]: data<3
Out[103]: 
array([[ True,  True, False, False],
       [False, False, False, False]])
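
To make the failure concrete, here is a minimal sketch (the names mask, message, any_small and all_small are mine, not from the question). It shows that the conditional effectively asks for bool() of the whole boolean array, which is what raises the error, and that reducing the array with any() or all() gives the single value if/else wants:

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
mask = data < 3  # element-wise comparison: a boolean array, not one bool

# `x if mask else 0` needs bool(mask), which is ambiguous for 8 elements.
try:
    bool(mask)
    message = "no error"
except ValueError as exc:
    message = str(exc)
print(message)

# Reducing the mask to a single boolean removes the ambiguity.
any_small = bool(mask.any())  # is at least one element below 3?
all_small = bool(mask.all())  # is every element below 3?
print(any_small, all_small)
```

Note that any() and all() answer a different question from the one the lambda intended: they describe the array as a whole rather than choosing a value per element.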

np.vectorize is the most general tool for applying a scalar function element-wise to an array (or to several arrays):

In [104]: f = np.vectorize(my_function, otypes=[int])

In [105]: f(data)
Out[105]: 
array([[1, 2, 0, 0],
       [0, 0, 0, 0]])

I included the otypes parameter to avoid one of the more common vectorize pitfalls that people ask about on SO: without it, the output dtype is inferred from the result of calling the function on the first element.
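
As a rough illustration of that pitfall, the sketch below uses a hypothetical variant of the lambda (mixed, my own name) that returns an integer for small values and 0.5 otherwise. Because the first element yields an integer, the version without otypes ends up with an integer result array, which cannot hold 0.5:

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Hypothetical variant: integer for small values, float otherwise.
mixed = lambda x: x if x < 3 else 0.5

# No otypes: the dtype is inferred from mixed(data[0, 0]), an integer,
# so the whole result array is stored with an integer dtype.
inferred = np.vectorize(mixed)(data)

# Explicit otypes: every result is stored as a float.
fixed = np.vectorize(mixed, otypes=[float])(data)

print(inferred.dtype, fixed.dtype)
print(fixed)
```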

np.vectorize is slower than plain iteration for small cases, but becomes competitive for large ones. Its main advantage is that it's simpler to use with multidimensional arrays. It's even better when the function takes several inputs and you want to take advantage of broadcasting.
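
A small sketch of the several-inputs case, assuming a hypothetical two-argument function keep_below and a per-row limits array (both invented for illustration). The (2, 1) limits array broadcasts against the (2, 4) data array, and the nested loop underneath is the plain-iteration equivalent for comparison:

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Hypothetical two-argument scalar function: keep x when below the limit.
def keep_below(x, limit):
    return x if x < limit else 0

f2 = np.vectorize(keep_below, otypes=[int])

# One limit per row: shape (2, 1) broadcasts against data's shape (2, 4).
limits = np.array([[3], [7]])
result = f2(data, limits)
print(result)

# Plain-iteration equivalent, written out by hand for this 2-D case.
by_hand = np.array(
    [[keep_below(x, lim) for x in row] for row, lim in zip(data, limits[:, 0])]
)
print(np.array_equal(result, by_hand))
```

The vectorized call stays the same if data gains more dimensions, whereas the hand-written loop would need another level of nesting.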
