Pandas dataframe - Sum columns up to a pre given value, return index
I am working with a pandas dataframe and can't figure out this problem:
I think I may need some for loops, but I am stuck on this one!
I want to sum column A from the bottom up and return the index at which the running sum reaches 28. In this example that is 10 + 7 + 11 = 28, and the index (Date) is 5, so I want to return 5.
Date   A
0      11
1      9
2      10
3      8
4      2
5      11
6      7
7      10
Using the following df:
import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]})
df = df.set_index('Date')
You can compute the backwards cumulative sum by reversing the dataframe before calling cumsum(). You can then reverse the resulting list and add it as another column to your original dataframe:
cumsum = df[::-1].cumsum()['A'].to_list()
cumsum.reverse()
df['cumsum'] = cumsum
Then you can get the first index of the subset of the df where cumsum is <= 28. If the values do not add up to exactly 28, this returns the first index whose bottom-up sum is still below 28.
index = df.loc[df['cumsum'] <= 28].first_valid_index()
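The steps above can also be folded into one small helper that leaves df unchanged. This is a minimal sketch rather than part of the original answer: the name bottom_up_index and its target parameter are made up for illustration, and it assumes the values in A are non-negative, so the bottom-up running sum never decreases.

```python
import pandas as pd

def bottom_up_index(series, target):
    """Return the first index label whose bottom-up running sum is <= target.

    Hypothetical helper; returns None when even the last row alone exceeds target.
    """
    # Running sum from the last row upwards, flipped back to the original order.
    running = series[::-1].cumsum()[::-1]
    within = running[running <= target]
    return within.index[0] if not within.empty else None

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]}).set_index('Date')

print(bottom_up_index(df['A'], 28))  # 5 (10 + 7 + 11 == 28)
print(bottom_up_index(df['A'], 29))  # 5 (going one row higher gives 30 > 29)
print(bottom_up_index(df['A'], 5))   # None (the last row alone is 10)
```

Returning None for "no such row" is a design choice made for this sketch; np.nan or raising an exception would work just as well, depending on how the caller handles a miss.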
Use:
import pandas as pd
# setup
df = pd.DataFrame.from_dict({'Date': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7},
'A': {0: 11, 1: 9, 2: 10, 3: 8, 4: 2, 5: 11, 6: 7, 7: 10}})
res = df.iloc[::-1, 1].cumsum().eq(28).idxmax()
print(res)
Output
5
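One caveat with this one-liner: when no bottom-up sum equals the target, eq() yields an all-False mask, and idxmax() on such a mask returns the first label of the reversed Series (7 here) rather than signalling a miss. A hedged sketch of a guard follows; the function name exact_bottom_up_index and the variable hits are my own and do not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({'Date': [0, 1, 2, 3, 4, 5, 6, 7],
                   'A': [11, 9, 10, 8, 2, 11, 7, 10]})

def exact_bottom_up_index(frame, target):
    # Hypothetical wrapper around the one-liner above.
    hits = frame['A'].iloc[::-1].cumsum().eq(target)
    # Check any() first so that an all-False mask is reported as "not found".
    return hits.idxmax() if hits.any() else None

print(exact_bottom_up_index(df, 28))  # 5
print(exact_bottom_up_index(df, 29))  # None
```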
Start by computing a temporary Series:
wrk = df.set_index('Date').A
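To compute the index of your "wanted" element, counting from the top, run:
import numpy as np

# Keep only the rows whose top-down running sum equals 28
res = wrk[wrk.cumsum() == 28]
# NaN when there is no match (the case for the sample data, counting from the top)
iFirst = res.index[0] if res.size > 0 else np.nan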
To compute the index counting from the bottom, you should also compute the cumsum from the bottom. The reversed boolean mask is aligned back to wrk by index label, so the selection still works:
res = wrk[wrk[::-1].cumsum() == 28]
iLast = res.index[-1] if res.size > 0 else np.nan