Summing one column based on a condition in another column of a pandas DataFrame

Question

I just started using Python and I am trying to create programs to help monitor some of my investments. Right now I have a function that gives me my current returns based on my initial buy price and the current price. Here is what my data frame looks like:

    Ticker   Expiration   Contracts   Call   Buy    Prem 12/22   Prem 12/23   Prem 12/25
0   x        date 1       1           $      0.13   0.15         0.12         0.13
1   y        date 2       1           $      0.33   0.34         0.34         0.39
2   z        date 3       1           $      0.25   NaN          NaN          0.25

This is the function I currently use to compute returns:

def returns(op):
    """
    Calculates the current return for each options
    """
    totalPrem=op.sum(axis=0,skipna=True)["Prem 12/22":]
    buy=op.sum(axis=0,skipna=True)["Buy"]
    return (totalPrem-buy)*100

This gives me the results by summing each column from Prem 12/22 onward and subtracting the sum of the Buy column from it. My problem is that on 12/22 and 12/23, z had not been bought yet. However, the returns function sums all of Buy. So the returns for 12/22 and 12/23 add the two data points in each of those columns and then subtract all three data points in Buy. This leads to the result:

Prem 12/22: -22
Prem 12/23: -25
Prem 12/25: 6

I want to alter my code so that for 12/22 and 12/23, the Buy column only adds the first two rows. Is there a way to sum the Buy column so that a row's Buy value is only included when that row has no NaN in the premium column being summed? The output I am looking for is:

Prem 12/22: 3
Prem 12/23: 0
Prem 12/25: 6

Thanks!

Answer 1

You can use a list comprehension to filter each column for notnull() rows and do the calculation per column. To apply this only to the columns with Prem in their name, I create a cols index object so the calculation can be applied dynamically to those columns:

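# Select every premium column by name
cols = df.columns[df.columns.str.contains('Prem')]
# For each premium column, sum the premium and the Buy price over the rows
# where that premium is present; round after scaling so int() cannot truncate
res = [int(round((df.loc[df[col].notnull(), col].sum() -
                  df.loc[df[col].notnull(), 'Buy'].sum()) * 100))
       for col in cols]
for c, r in zip(cols, res):
    print(f'{c}: {r}')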

Prem 12/22: 3
Prem 12/23: 0
Prem 12/25: 6
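
The same per-column filtering can probably be done without the explicit loop. Below is a minimal sketch, assuming the frame looks like the table in the question (rebuilt here with only the columns that matter; the names prem, held and buy are just for illustration). It turns the "premium is present" test into a 0/1 mask and multiplies it by Buy, so rows that were not yet bought contribute nothing to the Buy total for that date.

```python
import numpy as np
import pandas as pd

# Data rebuilt from the question's table (illustrative subset of columns).
df = pd.DataFrame({
    "Ticker": ["x", "y", "z"],
    "Buy": [0.13, 0.33, 0.25],
    "Prem 12/22": [0.15, 0.34, np.nan],
    "Prem 12/23": [0.12, 0.34, np.nan],
    "Prem 12/25": [0.13, 0.39, 0.25],
})

prem = df.filter(like="Prem")            # premium columns only
held = prem.notna().astype(float)        # 1.0 where a premium exists, else 0.0
# Copy Buy into every premium column, zeroed on rows not yet bought.
buy = held.mul(df["Buy"], axis=0)
# sum() skips NaN by default, so each premium total only covers held rows.
result = ((prem.sum() - buy.sum()) * 100).round().astype(int)
print(result)
```

Rounding after multiplying by 100 is a deliberate choice here: the subtraction leaves small floating-point residue, and casting to int without rounding would truncate it.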

Answer 2

Your goal is to calculate the return of the Buy column. Suppose the data frame that contains the Buy column is called df; then you can simply call the following:

def returnBuy(df):
    tsBuy=df["Buy"]
    return tsBuy.pct_change()
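
For reference, here is a small sketch of what pct_change does, assuming the three Buy values from the question's table. It computes the relative change from each row to the next, so the first entry has no predecessor and comes back as NaN. Note that this is a row-to-row change within Buy and does not look at the Prem columns.

```python
import pandas as pd

# Buy values taken from the question's table.
buy = pd.Series([0.13, 0.33, 0.25], name="Buy")

# Each entry is (current - previous) / previous; the first row is NaN.
changes = buy.pct_change()
print(changes)
```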

Answer 3

Maybe this: you could create a boolean column that flags NA/NaN values, then take the rows without missing data and sum them:

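import pandas as pd

data = {'col_1': [1, 2, 3, 4], 'col_2': [5, 7, pd.NA, 8]}
df = pd.DataFrame.from_dict(data)
# Note: should_add is True where col_2 is missing, so it is negated below
df['should_add'] = pd.isna(df["col_2"])
print(df)
# Keep only the rows where col_2 is present, then sum each column
sums = df[~df['should_add']].sum(axis=0)
print(sums)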

Or in one line:

sums= df[~pd.isna(df["col_2"])].sum(axis=0)
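
Applied to the frame from the question, the same row filter might look like the sketch below. It assumes the data shown in the question's table and reuses the question's function name returns purely for illustration; the filter is run once per premium column so that Buy is only summed over rows that have a premium on that date.

```python
import numpy as np
import pandas as pd

# Data rebuilt from the question's table (illustrative subset of columns).
df = pd.DataFrame({
    "Ticker": ["x", "y", "z"],
    "Buy": [0.13, 0.33, 0.25],
    "Prem 12/22": [0.15, 0.34, np.nan],
    "Prem 12/23": [0.12, 0.34, np.nan],
    "Prem 12/25": [0.13, 0.39, 0.25],
})

def returns(op):
    """Return per premium column, counting Buy only on rows with a premium."""
    out = {}
    for col in op.columns[op.columns.str.startswith("Prem")]:
        rows = op[~pd.isna(op[col])]      # drop rows missing this premium
        diff = rows[col].sum() - rows["Buy"].sum()
        out[col] = int(round(float(diff) * 100))
    return pd.Series(out)

print(returns(df))
```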

3 answers

Answer 1
1 2020-12-26 03:48:04

Answer 2
0 2020-12-26 03:37:56

Answer 3
0 2020-12-26 03:41:03
