
Trying to create a new dataframe based on internal sums of a column from another dataframe using Python/pandas

Let's assume I have a pandas dataframe df as follows:

import pandas as pd
df = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})

    Col1   Col2
0      1      5
1      2      6
2      3      7
3      4      8

Is there a way to replace each value in a column with the sum of that value and all the values below it in the column?

For example, for 'Col1' the result would be:

    Col1   Col2
0     10      5
1      9      6
2      7      7
3      4      8

1 becomes 1 + 2 + 3 + 4 = 10
2 becomes 2 + 3 + 4 = 9
3 becomes 3 + 4 = 7
4 remains 4

If this is possible, is there a way to specify a cutoff index after which this behavior takes effect? For example, if the cutoff index is the key 1, the result would be:

    Col1   Col2
0      1      5
1      2      6
2      7      7
3      4      8

I suspect there is no way to do this other than with loops, but I thought there might be a way using vectorized calculations.

Thanks heaps

Yes, you could use a loop, but a very simple one:

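def sum_col(column, start=0):
    values = column.values
    # Keep the first `start` values; replace each later one with the sum from its position to the end
    tail = [values[i:].sum() for i in range(start, len(values))]
    return list(values[:start]) + tail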

And usage:

df['Col1'] = sum_col(df['Col1'], 0)
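
The list comprehension re-sums a slice for every row, so the work grows quadratically with the length of the column. As a rough sketch (the helper name `suffix_sums` is made up for illustration, and `start` is assumed to be a position rather than an index label), a single reversed cumulative sum in NumPy should give the same numbers in one pass:

```python
import numpy as np
import pandas as pd

def suffix_sums(column, start=0):
    """Hypothetical helper: keep positions before `start`, replace the rest with suffix sums."""
    values = column.to_numpy()
    out = values.copy()
    # Reverse the tail, accumulate, and reverse back so each entry holds the sum from itself to the end
    out[start:] = values[start:][::-1].cumsum()[::-1]
    return out

df = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': [5, 6, 7, 8]})
df['Col1'] = suffix_sums(df['Col1'], start=0)
print(df)
```

To reproduce the cutoff example from the question (behavior starting after key 1), the positional argument would be `start=2`.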

Here is one way to avoid the loop:

import pandas as pd

your_df = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})

def your_func(df, column, cutoff):
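    # Reverse, take the cumulative sum, then reverse back to get the sum of each row and all rows below it
    x = df[column][::-1].cumsum()[::-1]
    # Assign through .loc rather than chained indexing, which may silently fail to update df
    df.loc[df.index > cutoff, column] = x[x.index > cutoff]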
    return df

# to use it
your_func(your_df, column='Col1', cutoff=1)

Out[68]: 
   Col1  Col2
0     1     5
1     2     6
2     7     7
3     4     8
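
If the original frame should stay untouched, one possible variation is to build a new frame with `Series.where` instead of assigning in place. This is only a sketch: the function name `reverse_cumsum_after` is hypothetical, and it assumes index labels that can be compared with `cutoff`, as in the example above.

```python
import pandas as pd

def reverse_cumsum_after(df, column, cutoff=None):
    """Return a new frame; rows with index label > cutoff hold the sum of that row and all rows below it."""
    # Reverse, accumulate, reverse back: each entry becomes the sum from its row to the end
    suffix = df[column][::-1].cumsum()[::-1]
    if cutoff is None:
        # No cutoff given: apply the suffix sums to the whole column
        return df.assign(**{column: suffix})
    keep = df.index <= cutoff  # rows that keep their original value
    return df.assign(**{column: df[column].where(keep, suffix)})

original = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': [5, 6, 7, 8]})
result = reverse_cumsum_after(original, 'Col1', cutoff=1)
print(result)
print(original)  # unchanged
```

Calling it without `cutoff` applies the sums to every row, which corresponds to the first result shown in the question.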
