
Vectorization of index operation for a scipy.sparse matrix

The following code runs too slowly even though everything seems to be vectorized.

from numpy import *
from scipy.sparse import *

n = 100000;
i = xrange(n); j = xrange(n);
data = ones(n);

A=csr_matrix((data,(i,j)));

x = A[i,j]

The problem seems to be that the indexing operation is implemented as a Python function, and invoking A[i,j] results in the following profiling output:

         500033 function calls in 8.718 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    7.933    0.000    8.156    0.000 csr.py:265(_get_single_element)
        1    0.271    0.271    8.705    8.705 csr.py:177(__getitem__)
(...)

Namely, the Python function _get_single_element gets called 100000 times, which is really inefficient. Why isn't this implemented in pure C? Does anybody know of a way of getting around this limitation and speeding up the above code? Should I be using a different sparse matrix type?
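For reference, a profile like the one above can be reproduced with Python's built-in cProfile module. This is a minimal sketch, assuming A, i and j are defined at the top level as in the snippet above:

import cProfile

# Profile the fancy-indexing call and sort by internal time,
# matching the "Ordered by: internal time" report shown above.
cProfile.run('x = A[i, j]', sort='tottime')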

You can use A.diagonal() to retrieve the diagonal much more quickly (0.0009 seconds vs. 3.8 seconds on my machine). However, if you want to do arbitrary indexing, then that is a more complicated question, because you aren't using slices so much as a list of indices. The _get_single_element function is being called 100000 times because it just iterates through the iterators (i and j) that you passed to it. A slice would be A[30:60,10] or something similar to that.

Also, I would use csr_matrix(eye(n,n)) to make the same matrix that you made with iterators, just for simplicity.
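A minimal sketch of both suggestions, assuming eye here means scipy.sparse.eye (which is what the question's star-imports resolve it to), so that no huge dense intermediate array is built:

from scipy.sparse import csr_matrix, eye

n = 100000
A = csr_matrix(eye(n, n))   # same identity matrix, built without iterators

d = A.diagonal()            # fast: returns the main diagonal as a 1-D ndarray
print(d[:5])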

Update:

Ok, since your question is truly about being able to access lots of random elements quickly, I will answer your questions as best as I can.

  • Why isn't this implemented in pure C?

The answer is simple: no one has gotten around to it yet. From what I have seen, there is still plenty of work to be done in the sparse matrix modules of SciPy. One part that is implemented in C is the conversion between the different sparse matrix formats.

  • Does anybody know of a way of getting around this limitation, and speeding up the above code?

You can try actually diving into the sparse matrix modules and trying to speed them up. I did so, and was able to get the time down to less than a third of the original when trying your code above for random accesses using csr matrices. I had to access _get_single_element directly and pare down the code significantly to do that, including taking out the bounds checks.

However, it was faster still to use a lil_matrix (though slower to initialize the matrix), but I had to do the accessing with a list comprehension because lil matrices aren't set up for the type of indexing you are doing. Using a list comprehension with the csr_matrix still leaves the lil matrix method way ahead, by the way. Ultimately, the lil matrix is faster for accessing random elements because it isn't compressed.
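A minimal sketch of the list-comprehension access on a lil_matrix (the variable names here are illustrative, not the answer's actual code):

from scipy.sparse import identity

n = 100000
A_lil = identity(n, format='lil')   # the same identity matrix in LIL format
rows = range(n); cols = range(n)

# Scalar lookups on a lil_matrix walk plain Python lists per row,
# so a comprehension over (row, col) pairs avoids csr's per-element machinery.
x = [A_lil[r, c] for r, c in zip(rows, cols)]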

Using the lil_matrix in its original form runs in about a fifth of the time of the code you have listed above. If I take out some bounds checks and call lil_matrix's _get1() method directly, I can bring the time down further, to about 7% of the original time. For clarity, that's a speed-up from 3.4-3.8 seconds to about 0.261 seconds.

Lastly, I tried writing my own function that accesses the lil matrix's data directly and avoids the repeated function calls. The time for this was about 0.136 seconds. This didn't take advantage of the data being sorted, which is another potential optimization (in particular if you are accessing a lot of elements that are on the same rows).
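Something along these lines; get_many is a hypothetical helper, not the answer's actual function, and unlike it this sketch does exploit the sortedness (via bisect) that the answer mentions as a further optimization:

from bisect import bisect_left
from scipy.sparse import identity

n = 100000
A_lil = identity(n, format='lil')

def get_many(A, rows, cols):
    """Look up many (row, col) entries of a lil_matrix without per-call overhead."""
    out = []
    for r, c in zip(rows, cols):
        row_cols = A.rows[r]            # sorted column indices stored for row r
        k = bisect_left(row_cols, c)    # binary search for column c
        if k < len(row_cols) and row_cols[k] == c:
            out.append(A.data[r][k])    # stored value
        else:
            out.append(0)               # implicit zero
    return out

x = get_many(A_lil, range(n), range(n))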

If you want to go faster than that, then you will probably have to write your own sparse matrix implementation in C.

  • Should I be using a different sparse matrix type?

Well, I suggest the lil matrix if your intent is to access a whole lot of elements, but it all depends on what you need to do. Do you also need to multiply matrices, for instance? Just remember that converting between matrix formats can at least sometimes (in certain circumstances) be quite fast, so don't rule out changing to a different matrix format to do different operations.
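For example, a sketch of switching formats per operation:

import numpy as np
from scipy.sparse import identity

n = 100000
A_lil = identity(n, format='lil')   # LIL: cheap random element access
A_csr = A_lil.tocsr()               # CSR: cheap arithmetic and row slicing
y = A_csr.dot(np.ones(n))           # do the algebra in CSR, the lookups in LIL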

If you don't actually need to do any algebra operations on your matrix, then maybe you should just use a defaultdict or something similar. The danger with a defaultdict is that whenever an element that isn't in the dict is asked for, it sets that item to the default and stores it, which could be problematic.
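A minimal sketch of that idea, keyed by (row, col) tuples; using a plain dict with .get() sidesteps the insert-on-miss behaviour described above:

n = 100000
entries = {(k, k): 1.0 for k in range(n)}   # the same identity "matrix" as a dict

# .get() returns the implicit zero without storing a new entry,
# unlike a defaultdict, which would insert a default on every miss.
x = [entries.get((r, c), 0.0) for r, c in zip(range(n), range(n))]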

I think _get_single_element is only called when the default dtype of 'object' is used. Have you tried providing a dtype, such as csr_matrix((data, (i,j)), dtype=int32)?
