
Pandas dataframe - Sum columns up to a pre given value, return index

I am working with a pandas dataframe and can't figure out this problem:

I think I may need some for loops, but I am stuck on this one!

If the sum of column A, taken from the bottom up, reaches 28, I want to return the index where the sum is 28. In this example it is 10+7+11 = 28, and the index (Date) is 5, so I want to return 5.

Date    A
0       11
1       9
2       10
3       8
4       2
5       11
6       7
7       10

Using the following df:

import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]})

df = df.set_index('Date')

You can find the backwards cumulative sum by reversing the dataframe. You can then reverse the resulting list and add it as another column to your original dataframe:

cumsum = df[::-1].cumsum()['A'].to_list()
cumsum.reverse()
df['cumsum'] = cumsum

Then you can get the first index of the subset of the df where the cumsum is <= 28 (if no running sum is exactly 28, this returns the index of the closest sum below 28).

 index = df.loc[df['cumsum'] <= 28].first_valid_index()
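The steps above can be folded into one small function. This is only a sketch: the name `first_index_within` and its parameters are made up for illustration, and it assumes the values in the column are non-negative, so the bottom-up running sum never decreases.

```python
import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]}).set_index('Date')


def first_index_within(frame, column, limit):
    """Hypothetical helper: first index whose bottom-up running sum is <= limit."""
    # Reverse, accumulate, then reverse back so the sums line up with the original rows.
    reverse_cumsum = frame[column][::-1].cumsum()[::-1]
    within = reverse_cumsum[reverse_cumsum <= limit]
    # Return None when even the last row alone exceeds the limit.
    return within.index[0] if not within.empty else None


exact = first_index_within(df, 'A', 28)    # the running sum hits 28 exactly at Date 5
closest = first_index_within(df, 'A', 29)  # no sum equals 29; 28 is the closest below it
nothing = first_index_within(df, 'A', 5)   # the last row alone (10) is already above 5
print(exact, closest, nothing)
```

With the sample data this prints `5 5 None`.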

Use:

import pandas as pd

# setup
df = pd.DataFrame.from_dict({'Date': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7},
                             'A': {0: 11, 1: 9, 2: 10, 3: 8, 4: 2, 5: 11, 6: 7, 7: 10}})

res = df.iloc[::-1, 1].cumsum().eq(28).idxmax()
print(res)

Output

5
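One thing to watch with this one-liner: if no running sum equals the target, `eq(...)` is all `False` and `idxmax()` still returns a label (the first one of the reversed series) instead of signalling "not found". A guarded variant might look like the sketch below; the function name `index_of_reverse_cumsum` and the choice of `None` as the "not found" value are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]})


def index_of_reverse_cumsum(frame, target):
    """Hypothetical helper: index where the bottom-up running sum of A equals target."""
    matches = frame['A'].iloc[::-1].cumsum().eq(target)
    # idxmax() on an all-False mask would return the first label, so check any() first.
    return matches.idxmax() if matches.any() else None


found = index_of_reverse_cumsum(df, 28)
missing = index_of_reverse_cumsum(df, 29)
unguarded = df['A'].iloc[::-1].cumsum().eq(29).idxmax()
print(found, missing, unguarded)
```

With the sample data this prints `5 None 7`: the unguarded call reports 7 even though no running sum equals 29.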

Start by computing a temporary Series:

wrk = df.set_index('Date').A

To compute the index of your "wanted" element, counting from the top, run:

import numpy as np  # needed for np.nan

res = wrk[wrk.cumsum() == 28]
iFirst = res.index[0] if res.size > 0 else np.nan

To compute the index counting from the bottom, compute the cumsum from the bottom as well:

res = wrk[wrk[::-1].cumsum() == 28]
iLast = res.index[-1] if res.size > 0 else np.nan
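To see both directions side by side on the data from the question, here is a self-contained sketch. The helper `index_where_cumsum_hits` is a hypothetical name, not part of pandas; it reverses the Series itself, so the same code serves both directions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]})
wrk = df.set_index('Date')['A']


def index_where_cumsum_hits(series, target):
    """Hypothetical helper: label of the first element where the running sum equals target."""
    mask = (series.cumsum() == target).to_numpy()
    hits = series.index[mask]
    return hits[0] if len(hits) > 0 else np.nan


i_first = index_where_cumsum_hits(wrk, 28)       # top-down: 11, 20, 30, ... never equals 28
i_last = index_where_cumsum_hits(wrk[::-1], 28)  # bottom-up: 10, 17, 28 at Date 5
print(i_first, i_last)
```

With the sample data this prints `nan 5`: counting from the top the running sum skips from 20 to 30, so only the bottom-up direction finds a match.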
