Efficient nested looping with pandas dataframe
I have a simple pandas DataFrame like this one:
d = {'col1': ['a','b','c','d','e'], 'col2': [1,2,3,4,5]}
df = pd.DataFrame(d)
df
  col1  col2
0    a     1
1    b     2
2    c     3
3    d     4
4    e     5
I need to iterate over it and compute a simple arithmetic result (such as a product) for every combination of row values. I was thinking of making a matrix and filling in the values, like this:
size = df.shape[0]
mtx = np.zeros(shape=(size, size))
mtx
array([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
But I somehow sense there is a more efficient way to do this than nested looping like this:
for index1, c11, c12 in df.itertuples():
    for index2, c21, c22 in df.itertuples():
        mtx[index1][index2] = float(c12) * float(c22)
mtx
array([[ 1., 2., 3., 4., 5.],
[ 2., 4., 6., 8., 10.],
[ 3., 6., 9., 12., 15.],
[ 4., 8., 12., 16., 20.],
[ 5., 10., 15., 20., 25.]])
Any ideas would be much appreciated! Thanks!
For operations like *, +, -, / you can do the following (this example is for *, but you can just change the operation in the last line if you want +, - or /):
import numpy as np
import pandas as pd
d = {'col1': ['a','b','c','d','e'], 'col2': [1,2,3,4,5]}
df = pd.DataFrame(d)
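# Wrap the values in an extra list to get a 2-D row vector of shape (1, 5)
a = np.array([df.col2.tolist()])
# (5, 1) column times (1, 5) row broadcasts to the 5x5 table of pairwise products
a.T * a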
The result is:
array([[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]], dtype=int64)
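The expression a.T*a relies on NumPy broadcasting: a has shape (1, 5), a.T has shape (5, 1), and multiplying them stretches both operands to (5, 5). As a rough sanity check, the sketch below compares that result with the nested loop from the question. The np.multiply.outer line is an alternative spelling that was not part of the original answer, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e'], 'col2': [1, 2, 3, 4, 5]})

# Row vector of shape (1, 5), built the same way as in the answer.
a = np.array([df.col2.tolist()])
broadcast = a.T * a  # (5, 1) * (1, 5) -> (5, 5)

# Nested-loop version from the question, kept only for comparison.
n = df.shape[0]
looped = np.zeros((n, n))
for i, _, x in df.itertuples():
    for j, _, y in df.itertuples():
        looped[i, j] = float(x) * float(y)

# Alternative (an assumption, not in the original answer): outer product of the 1-D values.
values = df['col2'].to_numpy()
outer = np.multiply.outer(values, values)

print(broadcast.shape)
print(np.array_equal(broadcast, looped))
print(np.array_equal(broadcast, outer))
```

Both comparisons should agree with the loop, so the loop can be dropped entirely; for large frames the broadcast version avoids the Python-level iteration.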
Change a.T*a to a.T+a for the pairwise sum, and to a.T-a for the pairwise difference. If you want pairwise division, you can change it to a.T/a, but remember to include the line a=a.astype(float) above the operation. The cast matters on Python 2, where / on integer arrays performs integer division; on Python 3, / already returns floats.
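If you want to switch operations without editing the expression each time, one possible way is to pass the operator in as an argument and label the result with col1. This is only a sketch: the helper name pairwise and its parameters are hypothetical, not something from pandas or NumPy, and it converts the values to float up front so division behaves the same on any Python version.

```python
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e'], 'col2': [1, 2, 3, 4, 5]})


def pairwise(frame, op, value_col='col2', label_col='col1'):
    """Hypothetical helper: apply op to every (row i, row j) pair of value_col."""
    # Cast to float so that division never falls back to integer division.
    values = frame[value_col].to_numpy(dtype=float)
    # values[:, None] has shape (n, 1); values[None, :] has shape (1, n).
    result = op(values[:, None], values[None, :])
    labels = frame[label_col].to_numpy()
    return pd.DataFrame(result, index=labels, columns=labels)


products = pairwise(df, operator.mul)
ratios = pairwise(df, operator.truediv)
differences = pairwise(df, operator.sub)

print(products)
print(ratios.loc['d', 'b'])  # value of row 'd' divided by value of row 'b'
```

Returning a DataFrame indexed by col1 on both axes makes it possible to look up a pair by label instead of by position.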