SQL group by with division in pandas

Question

I have a SQL statement and would like to know how to write it in pandas, perhaps using groupby and apply.

Given a table with columns A and B:

 Select A, sum(B) / sum(A)
 from table
 group by A;

So far I have:

def func(group):
   x = group['B']
   y = group['A']
   return x.sum() / y.sum()

table.groupby('A').apply(func)

This produces a series of numbers without column A, the column used for grouping. I would like the output to be a DataFrame that also has A as a separate column, just like the result of the SQL statement above. Can anyone help me with this?

Thanks!

Answer 1

Is this what you want?

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 4], 'B': [2, 3, 4, 5]})

def func(group):
   x = group['B']
   y = group['A']
   return x.sum() / y.sum()

df.groupby('A').apply(func).reset_index()


Out[934]: 
   A         0
0  1  2.500000
1  3  1.333333
2  4  1.250000

Answer 2

There's no need for apply here. It is much faster to group, compute the sums, and divide directly, because pandas vectorises these operations.

Borrowing from @Wen's setup, this is how I'd do it:

v = df.groupby('A')[['A', 'B']].sum()
v['B'] /= v['A']
del v['A']

          B
A          
1  2.500000
3  1.333333
4  1.250000
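The result above keeps A as the index rather than as a regular column, which is what the question asked for. A minimal sketch of one way to get there, assuming the same sample data: A is copied into a helper column so it can be summed without selecting the grouping key itself. The names `A_sum` and `ratio` are made up for this illustration and are not part of either answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 4], 'B': [2, 3, 4, 5]})

# Copy A into a helper column (hypothetical name) so it can be summed per group
sums = (
    df.assign(A_sum=df['A'])
      .groupby('A')[['A_sum', 'B']]
      .sum()
)

# Vectorised division of the per-group sums, then turn the index A back into a column
result = (sums['B'] / sums['A_sum']).rename('ratio').reset_index()

print(result)
# ratio per A: 1 -> 2.5, 3 -> 1.333..., 4 -> 1.25
```

Naming the Series before `reset_index()` also avoids the unnamed `0` column seen in the apply-based output.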

Answer 1
3 2018-01-22 02:50:12

Answer 2
3 2018-01-22 02:55:13