Iterating over all DataFrame columns

Question

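I want to compare all rows of two given DataFrames.

How can I optimize the following code so that it dynamically iterates over all columns of the given pandas DataFrames?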

import pandas as pd

df1,df2 = pd.read_csv(...)

i = 0  # counter of matching values, reset after each row pair

for index2, row2 in df2.iterrows():
    for index1, row1 in df1.iterrows():
        if row1[0]==row2[0]: i = i+1
        if row1[1]==row2[1]: i = i+1
        if row1[2]==row2[2]: i = i+1
        if row1[3]==row2[3]: i = i+1
        print("# same values: "+str(i))
        i = 0

Answer 1

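IIUC, you need to check whether entire rows of one DataFrame are equal to the corresponding rows of the other. You can compare the two DataFrames for equality, then use the all method with axis=1 to check each row, and then sum the result: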

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [2, 3, 4, 5, 6]})
df2 = pd.DataFrame({'a': [1, 5, 3, 7, 5], 'b': [2, 3, 8, 5, 6]})

In [1531]: df1 == df2
Out[1531]: 
       a      b
0   True   True
1  False   True
2   True  False
3  False   True
4   True   True

In [1532]: (df1 == df2).all(axis=1)
Out[1532]: 
0     True
1    False
2    False
3    False
4     True
dtype: bool

In [1533]: (df1 == df2).all(axis=1).sum()
Out[1533]: 2

result = (df1 == df2).all(axis=1).sum()

In [1535]: print("# same values: "+str(result))
# same values: 2
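
If you want the number of matching columns in each row, which is closer to what the question's loop prints, the same elementwise comparison can be summed along axis=1 instead of reduced with all. This is a minimal sketch, assuming both DataFrames have identical index and column labels (the == operator raises a ValueError otherwise). The names equal, matches_per_row and identical_rows are mine, not from the answer.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [2, 3, 4, 5, 6]})
df2 = pd.DataFrame({'a': [1, 5, 3, 7, 5], 'b': [2, 3, 8, 5, 6]})

# Elementwise comparison; requires identically labeled DataFrames.
equal = df1 == df2

# Count the matching columns in each row (True counts as 1).
matches_per_row = equal.sum(axis=1)
print(matches_per_row.tolist())  # [2, 1, 1, 1, 2]

# Rows in which every column matches, as in the session above.
identical_rows = int(equal.all(axis=1).sum())
print("# same values: " + str(identical_rows))  # # same values: 2
```

Neither expression depends on how many columns the DataFrames have, so no column index needs to be hard-coded.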

Answer 2

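Your nested for loops suggest that you are comparing every row of the first DataFrame with every row of the second DataFrame, and counting the cases where the values in corresponding columns match.

If so, you can rely on numpy broadcasting to sum the matches for each row of df1 against all rows of df2, and then sum those counts over all rows of df1 to get the total, like this: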

df1.apply(lambda x: np.sum(df2.values == x.values), axis=1)
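
To see what the lambda computes for a single row, here is a small sketch with made-up values (the arrays other_rows and row are illustrative and not taken from the answer). Comparing a 2-D array of shape (n, k) with a 1-D array of length k broadcasts the 1-D array against every row of the 2-D array:

```python
import numpy as np

# Hypothetical values, for illustration only.
other_rows = np.array([[3, 2],
                       [3, 4],
                       [2, 4]])
row = np.array([2, 4])

# `row` is broadcast against each row of `other_rows`, column by column.
mask = other_rows == row
print(mask)
# [[False False]
#  [False  True]
#  [ True  True]]

# Total number of column-wise matches between `row` and all rows.
match_count = int(np.sum(mask))
print(match_count)  # 3
```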

To illustrate, take two randomly sampled DataFrames:

df1 = pd.DataFrame(np.random.randint(1, 5, (10, 2)))

   0  1
0  2  4
1  2  3
2  4  1
3  3  3
4  3  3
5  4  4
6  2  4
7  3  4
8  3  4
9  4  1

df2 = pd.DataFrame(np.random.randint(1, 5, (10, 2)))

   0  1
0  3  2
1  3  4
2  4  4
3  2  3
4  4  3
5  4  1
6  4  1
7  3  4
8  3  1
9  1  4

After comparing each row of df1 with all rows of df2, you get the sum of equal values for every row of df1:

df1.apply(lambda x: np.sum(df2.values == x.values), axis=1)

0    5
1    3
2    7
3    6
4    6
5    8
6    5
7    8
8    8
9    7

You can then sum these counts, or do it all in one step:

df1.apply(lambda x: np.sum(df2.values == x.values), axis=1).sum()

63
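
The same numbers can also be obtained without apply by broadcasting the two arrays against each other in one step. The sketch below is an alternative to the approach above, not part of the original answer. It assumes both DataFrames have the same number of columns in the same order, and it builds an intermediate array of shape (len(df1), len(df2), n_columns), so memory use grows with the product of the row counts. The values are the sample data printed above; the names a, b, pair_matches, per_row and total are mine.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[2, 4], [2, 3], [4, 1], [3, 3], [3, 3],
                    [4, 4], [2, 4], [3, 4], [3, 4], [4, 1]])
df2 = pd.DataFrame([[3, 2], [3, 4], [4, 4], [2, 3], [4, 3],
                    [4, 1], [4, 1], [3, 4], [3, 1], [1, 4]])

a = df1.to_numpy()[:, None, :]  # shape (10, 1, 2)
b = df2.to_numpy()[None, :, :]  # shape (1, 10, 2)

# pair_matches[i, j] = number of columns where df1 row i equals df2 row j
pair_matches = (a == b).sum(axis=2)

per_row = pair_matches.sum(axis=1)  # matches of each df1 row against all of df2
total = int(pair_matches.sum())
print(per_row.tolist())  # [5, 3, 7, 6, 6, 8, 5, 8, 8, 7]
print(total)             # 63
```

Each entry pair_matches[i, j] is the count that the question's loop prints for one pair of rows, although the loop visits the pairs in a different order (df2 in the outer loop).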

2 answers

Answer 1: 2, 2016-01-06 22:02:22
Answer 2: 1, accepted, 2016-01-06 23:20:09
