Python: Find all rows in a DataFrame whose values in a specific column sum to 0

Question

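I want to extract all rows in a DataFrame that can be grouped so that a specific column sums to 0 within the group.

For example, suppose I have the following rows: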

Row 1 1000
Row 2 -1000
Row 3 6000
Row 4 8000

Since the column sums to 0 for them (+1000 - 1000 = 0), I would group Row 1 and Row 2. How can I do this in Python? How can I use numpy to achieve this?

Answer 1

To get a more instructive result, I extended your sample DataFrame to:

   Id  Amount
0   1    1000
1   2   -1000
2   3   -5000
3   4    6000
4   5    8000
5   6   -2000
6   7   -4000
7   8   -2000
8   9    1500
9  10     500
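
The answer shows only the printed frame, not how it was built. A minimal sketch of one way to construct it, assuming a default integer index and the column names Id and Amount taken from the printout:

```python
import pandas as pd

# Assumed construction; the original answer only prints the frame.
df = pd.DataFrame({
    "Id": range(1, 11),
    "Amount": [1000, -1000, -5000, 6000, 8000, -2000, -4000, -2000, 1500, 500],
})
print(df)
```

The code below relies on the default 0..n-1 index, because it uses the index labels of the expanding sum as row positions.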

You can generate the "boundary row index pairs" as follows:

result = []
# Starting from each row, except the last
for i in range(df.index.size - 1):
    # Compute expanding sum
    s = df.iloc[i:].expanding().Amount.sum()
    # Find indices of zeroes
    ind = s[s == 0].index
    # Append "start == i, end == j" to the result
    result.extend([ [i, j] for j in ind ])

The result is:

[[0, 1], [1, 3], [1, 7], [4, 7], [7, 9]]
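
The question also asks about numpy. The following is a sketch of a different technique from the loop above, not the answer's method: a contiguous range sums to 0 exactly when the prefix sums just before it and at its end are equal. The function name zero_sum_ranges is hypothetical, and the sketch assumes integer amounts (so exact equality is safe) and a column small enough for an n by n comparison matrix.

```python
import numpy as np

def zero_sum_ranges(amounts):
    """Return [start, end] pairs (inclusive positions) of contiguous runs summing to 0."""
    values = np.asarray(amounts)
    # prefix[k] is the sum of the first k values, so prefix[0] == 0
    prefix = np.concatenate(([0], np.cumsum(values)))
    # equal[i, j] is True when prefix[i] == prefix[j], i.e. values[i:j] sums to 0
    equal = prefix[:, None] == prefix[None, :]
    # keep only pairs with i < j, which correspond to non-empty ranges
    starts, stops = np.nonzero(np.triu(equal, k=1))
    return [[int(i), int(j) - 1] for i, j in zip(starts, stops)]

amounts = [1000, -1000, -5000, 6000, 8000, -2000, -4000, -2000, 1500, 500]
print(zero_sum_ranges(amounts))
# [[0, 1], [1, 3], [1, 7], [4, 7], [7, 9]]
```

Unlike the loop, this sketch would also report a single-row range when the last row is itself 0.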

To retrieve, print, and check the row "ranges" found above, run:

for i, j in result:
    print(f'From {i} to {j}:')
    print(df.iloc[i:j+1])
    print(f'Sum: {df.iloc[i:j+1].Amount.sum()}\n')

The output is:

From 0 to 1:
   Id  Amount
0   1    1000
1   2   -1000
Sum: 0

From 1 to 3:
   Id  Amount
1   2   -1000
2   3   -5000
3   4    6000
Sum: 0

From 1 to 7:
   Id  Amount
1   2   -1000
2   3   -5000
3   4    6000
4   5    8000
5   6   -2000
6   7   -4000
7   8   -2000
Sum: 0

From 4 to 7:
   Id  Amount
4   5    8000
5   6   -2000
6   7   -4000
7   8   -2000
Sum: 0

From 7 to 9:
   Id  Amount
7   8   -2000
8   9    1500
9  10     500
Sum: 0
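
If the ranges are needed as one object rather than as printed output, one possible approach is to stack the slices with pd.concat and label each slice by its range. This is a sketch only: the frame is rebuilt here from the printed values, and the level names "range" and "row" are made up for illustration.

```python
import pandas as pd

# Assumed construction of the example frame and the pairs found above.
df = pd.DataFrame({
    "Id": range(1, 11),
    "Amount": [1000, -1000, -5000, 6000, 8000, -2000, -4000, -2000, 1500, 500],
})
result = [[0, 1], [1, 3], [1, 7], [4, 7], [7, 9]]

# Stack every slice; the outer index level identifies the range it came from.
grouped = pd.concat(
    [df.iloc[i:j + 1] for i, j in result],
    keys=[f"{i}-{j}" for i, j in result],
    names=["range", "row"],
)
print(grouped)

# Every range should sum to 0 in the Amount column.
print(grouped.groupby(level="range").Amount.sum())
```

A row that belongs to several ranges appears once per range in the stacked frame.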

Edit following the comments as of 12:52Z

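If you want only "leaf-level" ranges (ones that do not contain other ranges), then after finding the zero indices in the expanding sum for a given start row, you should report only the first range, because the later ones merely contain ranges that have already been reported.

So the code should be changed to: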

result = []
# Starting from each row, except the last
for i in range(df.index.size - 1):
    # Compute expanding sum
    s = df.iloc[i:].expanding().Amount.sum()
    # Find indices of zeroes
    ind = s[s == 0].index   
    if ind.size > 0:        # Something found
        result.append([i, ind[0]])  # Append "from i to the first 'zero row'"

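Note:

I added the if to avoid an "index out of range" exception when no zero sum is found.
I changed extend to append, because:
- in the previous version I wanted the list to be "unpacked", with each pair added to the result separately,
- now I add only a single pair, which should not be "unpacked".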
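
The difference between extend and append described above can be illustrated with plain lists. The variable names are for illustration only:

```python
pairs = [[1, 3], [1, 7]]

extended = []
extended.extend(pairs)      # adds each pair as its own element
print(extended)             # [[1, 3], [1, 7]]

appended = []
appended.append([1, 3])     # adds the single pair as one element
print(appended)             # [[1, 3]]

flattened = []
flattened.extend([1, 3])    # extend on one pair would split it into its numbers
print(flattened)            # [1, 3]
```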

This time the result is:

[[0, 1], [1, 3], [4, 7], [7, 9]]

Note that the range [1, 7] (present in the first solution) has not been added.

So now you have only ranges that do not contain other ranges.
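
One caveat, offered as an observation rather than as part of the original answer: taking the first zero per start row removes longer ranges with the same start, but a reported range could still contain a shorter one that starts later. For hypothetical amounts 5, 1, -1, -5 the loop would report both [0, 3] and [1, 2]. If strictly minimal ranges are needed, a post-filter along these lines might help (drop_enclosing is a made-up helper name):

```python
def drop_enclosing(ranges):
    """Keep only [start, end] pairs that do not contain another pair in the list."""
    kept = []
    for a, b in ranges:
        contains_other = any(
            (c, d) != (a, b) and a <= c and d <= b
            for c, d in ranges
        )
        if not contains_other:
            kept.append([a, b])
    return kept

# Hypothetical amounts 5, 1, -1, -5 give these first-zero ranges:
print(drop_enclosing([[0, 3], [1, 2]]))                  # [[1, 2]]
# The ranges from the answer pass through unchanged:
print(drop_enclosing([[0, 1], [1, 3], [4, 7], [7, 9]]))  # [[0, 1], [1, 3], [4, 7], [7, 9]]
```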

Solution 1
0 Accepted 2020-08-31 12:04:52
