What's the numpy/pythonic way to non-destructively replace a column x of a matrix with f(x)?
I have a matrix M:
import numpy
M = numpy.array([[1,2,3], [4,5,6]])
I want a function that returns an array with every entry x of the col-th column of M replaced by f(x), but doesn't modify the input matrix.
I'm doing this with:
def my_func(M, col, f=lambda x: x+1):
    copy = numpy.copy(M)
    copy.T[col] = [f(x[col]) for x in copy]
    return copy
which works out well:
>>> my_func([[1,2,3], [4,5,6]], 1)
array([[1, 3, 3],
       [4, 6, 6]])
But I'm seeing that this code is a huge bottleneck in my program, since my matrix M is large. Is there a faster way to do this?
I've also tried
numpy.fromiter(map(lambda x: f(x), M.T[col]), dtype=float)
but this doesn't seem to yield much of a speed-up.
(In reality, my matrix M is a numpy.ma.masked_array, and f is more complex than just adding 1, but I don't know if those details make any difference.)
The following should help the situation. Using %timeit in IPython, I got 100000 loops, best of 3: 14.7 us per loop for your function and 100000 loops, best of 3: 9.83 us per loop for the one listed below.
import numpy
M = numpy.array([[1,2,3], [4,5,6]])
def my_func(M, col, f=lambda x: x+1):
    copy = numpy.copy(M)
    # apply f to the whole column at once instead of element by element
    copy[:, col] = f(copy[:, col])
    return copy
print(my_func(M, 1))
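The timings quoted above were taken on a 2x3 matrix. To repeat the comparison on a larger array, a minimal timing sketch might look like the following. The matrix size, the repeat count, and the names loop_version, vector_version and big are assumptions chosen for illustration; actual timings will vary with the machine and the NumPy version.

```python
import timeit
import numpy

def loop_version(M, col, f=lambda x: x + 1):
    # Element-by-element approach from the question
    copy = numpy.copy(M)
    copy.T[col] = [f(x[col]) for x in copy]
    return copy

def vector_version(M, col, f=lambda x: x + 1):
    # Whole-column approach from this answer
    copy = numpy.copy(M)
    copy[:, col] = f(copy[:, col])
    return copy

# Hypothetical test matrix: 10,000 rows, 3 columns
big = numpy.arange(30000).reshape(10000, 3)

# Both versions should produce identical results
same = numpy.array_equal(loop_version(big, 1), vector_version(big, 1))

# Time 10 calls of each version
t_loop = timeit.timeit(lambda: loop_version(big, 1), number=10)
t_vec = timeit.timeit(lambda: vector_version(big, 1), number=10)
print("same result:", same)
print("loop: %.4f s, vectorized: %.4f s" % (t_loop, t_vec))
```

On a matrix this size the gap should be much wider than in the tiny example, because the loop version makes one Python-level call to f per row.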
The one issue with my version is that f has to work on vector input. However, using vector_func = numpy.vectorize(func) this can be done for any function, but it might affect the time boost of my method.
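As a rough illustration of the numpy.vectorize route, here is a sketch with a function that only accepts scalars. The function scalar_f and the sample matrix are made up for this example. Note that the NumPy documentation describes vectorize as a convenience wrapper rather than a performance tool, so it still calls the Python function once per element.

```python
import math
import numpy

def scalar_f(x):
    # Hypothetical scalar-only function: math.sqrt does not accept
    # a multi-element array
    return math.sqrt(x) + 1

# Wrap the scalar function so it can be called on a whole column
vector_f = numpy.vectorize(scalar_f)

def my_func(M, col, f):
    copy = numpy.copy(M)
    copy[:, col] = f(copy[:, col])
    return copy

# Float matrix so the float results are stored without truncation
M = numpy.array([[1.0, 4.0, 3.0], [4.0, 9.0, 6.0]])
result = my_func(M, 1, vector_f)
print(result)
```

If f can be rewritten with NumPy ufuncs (for example numpy.sqrt instead of math.sqrt), passing it directly avoids the per-element overhead.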
Another cool thing is that this can be done for more complicated indexing/slices.
def my_func2(M, index, f=lambda x: x+1):
    copy = numpy.copy(M)
    copy[index] = f(copy[index])
    return copy
# prints the same result as before
print(my_func2(M, (slice(None), 1)))
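A few other index forms that should work with the same pattern are sketched below. The helper name replace_at is hypothetical and simply repeats the function above so the block runs on its own; numpy.s_ is a standard NumPy helper for building slice tuples.

```python
import numpy

def replace_at(M, index, f=lambda x: x + 1):
    # Same pattern as above: copy first, then assign f(selection) back
    copy = numpy.copy(M)
    copy[index] = f(copy[index])
    return copy

M = numpy.array([[1, 2, 3], [4, 5, 6]])

# Column 1, written with numpy.s_ instead of (slice(None), 1)
col_result = replace_at(M, numpy.s_[:, 1])

# Part of a row: row 0, columns 1 onward
row_result = replace_at(M, numpy.s_[0, 1:])

# Boolean mask: every entry greater than 3
mask_result = replace_at(M, M > 3)

print(col_result)
print(row_result)
print(mask_result)
```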
Can you just apply your function to the whole column at once?
def my_func(M, col, f=lambda x: x+1):
    copy = numpy.copy(M)
    copy[:,col] = f(M[:,col])
    return copy
Doing a matrix-scale operation is going to be faster than iterating through element by element.
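Since the question mentions that M is really a masked array, here is a sketch of the same idea for that case. It assumes M is a numpy.ma.MaskedArray and that f accepts masked-array input; the name my_func_masked and the sample mask are made up for illustration. One detail worth checking in your own code: numpy.copy returns a plain ndarray by default, so the sketch uses the array's own copy() method to keep the mask.

```python
import numpy

def my_func_masked(M, col, f=lambda x: x + 1):
    # M.copy() keeps the MaskedArray type and its mask;
    # numpy.copy(M) would return a plain ndarray by default
    copy = M.copy()
    copy[:, col] = f(M[:, col])
    return copy

# Sample masked matrix with one masked entry at row 1, column 1
M = numpy.ma.masked_array(
    [[1, 2, 3], [4, 5, 6]],
    mask=[[False, False, False], [False, True, False]],
)
result = my_func_masked(M, 1)
print(result)
```

The masked entry stays masked in the result, and the input array and its mask are left untouched.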