sum() on specific columns of a dataframe

Question

I cannot work out how to add a new row at the end. The last row needs to apply sum() to specific columns and divide two other columns, with a filter applied so that only specific rows are summed.

df:

    Categ    CategID    col3      col4      col5      col6
0   Cat1     1          -65.90    -100.40   -26.91    23.79
1   Cat2     2          -81.91    -15.30    -16.00    10.06
2   Cat3     3          -57.70    -18.62      0.00    0.00

I would like the output to look like this:

3   Total              -123.60   -119.02    -26.91    100*(-119.02/-26.91)

col3, col4 and col5 would hold sum(), and col6 would be the formula above.

If [CategID]==2, the row should not be included in the TOTAL.

I was able to get almost what I wanted by using .query(), like so:

# tg is a list

df.loc['Total'] = df.query("CategID in @tg").sum()

But with the above I cannot make 'col6' equal to 100*(col4.sum() / col5.sum()), because every column is a plain sum().

Then I tried with a Series like so, but I don't understand how to apply a filter with .where():

s = pd.Series([df['col3'].sum(),
               df['col4'].sum(),
               df['col5'].sum(),
               100 * (df['col4'].sum() / df['col5'].sum())],
              index=['col3', 'col4', 'col5', 'col6'])
df.loc['Total'] = s.where('tag1' in tg)

Using the above Series() works until I add .where(), which gives the error: ValueError: Array conditional must be same shape as self

So, can I accomplish this with the first method, using .query(), and just somehow modify one of the columns in TOTAL? Otherwise, what am I doing wrong with .where() in the second method?

Thanks

Answer 1

IIUC, you can try:

# Blank out the CategID == 2 row, then sum the numeric columns only
s = df.mask(df['CategID'].eq(2)).drop(columns="CategID").sum(numeric_only=True)
s.loc['col6'] = 100*(s['col4'] / s['col5'])
df.loc[len(df)] = s

df = df.fillna({'Categ':'Total',"CategID":''})

print(df)

   Categ CategID    col3    col4   col5        col6
0   Cat1       1  -65.90 -100.40 -26.91   23.790000
1   Cat2       2  -81.91  -15.30 -16.00   10.060000
2   Cat3       3  -57.70  -18.62   0.00    0.000000
3  Total         -123.60 -119.02 -26.91  442.289112
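
To address the first method in the question, here is a minimal sketch that keeps .query() and then overwrites only the col6 entry of the total. It assumes tg holds the CategID values to include (here [1, 3], which is a guess at what the asker's list contains), and the names total, total_row and result are made up for illustration.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "Categ": ["Cat1", "Cat2", "Cat3"],
    "CategID": [1, 2, 3],
    "col3": [-65.90, -81.91, -57.70],
    "col4": [-100.40, -15.30, -18.62],
    "col5": [-26.91, -16.00, 0.00],
    "col6": [23.79, 10.06, 0.00],
})

tg = [1, 3]  # assumption: the CategID values that count toward the total

# Filter the rows first, then sum only the columns that need a plain sum.
total = df.query("CategID in @tg")[["col3", "col4", "col5"]].sum()

# col6 is not a sum: compute it from the two sums above.
total["col6"] = 100 * total["col4"] / total["col5"]

# Build a one-row frame for the total and append it to the original.
total_row = pd.DataFrame([{"Categ": "Total", "CategID": "", **total.to_dict()}])
result = pd.concat([df, total_row], ignore_index=True)

print(result)
```

As for the second method, Series.where() expects a boolean condition shaped like the Series itself, so a single True/False from 'tag1' in tg is not a row filter, which appears to be what the ValueError is reporting. Filtering the rows before summing, as above, avoids the need for .where() altogether.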

Solution 1
2 Accepted 2021-03-20 19:37:37
