Add a new column in pandas that is the sum of another column's values

Question

I'm using pandas and trying to add a new column called 'Total' that holds the sum of all the vehicle numbers for that year.

From this:

type            year    number

Private cars    2005    401638
Motorcycles     2005    138588
Off peak cars   2005    12947
Motorcycles     2005    846

To something like this:

type            year    number    Total

Private cars    2005    401638    554019
Motorcycles     2005    138588
Off peak cars   2005    12947
Motorcycles     2005    846

Answer 1

Use GroupBy + transform with sum:

df['Year_Total'] = df.groupby('year')['number'].transform('sum')

Note that this gives you the yearly total on every row. If you wish to "blank out" totals for certain rows, you should specify precisely the logic for this.
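As a self-contained sketch of the line above, the following rebuilds the question's sample table by hand (the DataFrame construction is an assumption, since the question only shows printed output) and applies the same groupby and transform call. Because transform returns a result aligned with the original index, every row receives the sum of its own year group.

```python
import pandas as pd

# Sample data retyped from the question; all four rows belong to 2005
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars", "Motorcycles"],
    "year": [2005, 2005, 2005, 2005],
    "number": [401638, 138588, 12947, 846],
})

# transform('sum') returns a Series with the same index as df,
# so each row gets the total of the year it belongs to
df["Year_Total"] = df.groupby("year")["number"].transform("sum")
print(df)
```

With this data every row ends up with 554019, the figure shown in the question's desired output.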

Answer 2

Use GroupBy.transform and then, if necessary, replace the duplicated values:

df['Total'] = df.groupby('year')['number'].transform('sum')
print (df)
            type  year  number  Total
0   Private cars  2005       1      3
1    Motorcycles  2005       2      3
2  Off peak cars  2006       5     20
3    Motorcycles  2006       7     20
4   Motorcycles1  2006       8     20

import numpy as np
# Keep the total only on the first row of each year
df.loc[df['year'].duplicated(), 'Total'] = np.nan
print (df)
            type  year  number  Total
0   Private cars  2005       1    3.0
1    Motorcycles  2005       2    NaN
2  Off peak cars  2006       5   20.0
3    Motorcycles  2006       7    NaN
4   Motorcycles1  2006       8    NaN

Replacing with empty strings is possible, but not recommended, because the column then mixes numeric values with strings and some functions will fail:

df.loc[df['year'].duplicated(), 'Total'] = ''
print (df)
            type  year  number Total
0   Private cars  2005       1     3
1    Motorcycles  2005       2      
2  Off peak cars  2006       5    20
3    Motorcycles  2006       7      
4   Motorcycles1  2006       8
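To illustrate why the NaN version is the safer of the two, here is a minimal sketch using the same sample values as above (the DataFrame construction and the names `totals` and `first_of_year` are assumptions made for the example). It uses `Series.where` as an alternative to the `.loc` assignment and checks that the column stays numeric, so aggregations keep working.

```python
import pandas as pd

# Same values as the printed example in this answer
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

# Yearly total for every row
totals = df.groupby("year")["number"].transform("sum")

# True only for the first row of each year
first_of_year = ~df["year"].duplicated()

# Keep the total on the first row of each year, NaN elsewhere
df["Total"] = totals.where(first_of_year)

print(df)
print(df["Total"].dtype)  # still a numeric dtype
print(df["Total"].sum())  # NaN values are skipped by default
```

Because the blanks are NaN rather than '', the column keeps a float dtype and `sum` simply skips the missing rows.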

Answer 3

This gives a similar dataframe:

import numpy as np
# Sum of the whole 'number' column (not grouped by year)
total = df['number'].sum()
df['Total'] = np.ones_like(df['number'].values) * total
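Note that this approach sums the entire column, so it matches the per-year result only when the data contains a single year, as in the question. The sketch below, which reuses the two-year sample values from Answer 2 (the column names `Grand_Total` and `Year_Total` are made up for the comparison), shows the difference. Assigning the scalar directly is an equivalent shortcut, since pandas broadcasts it to every row.

```python
import pandas as pd

# Two years of data, so the grand total and the yearly totals differ
df = pd.DataFrame({
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

# A scalar assigned to a column is broadcast to every row
df["Grand_Total"] = df["number"].sum()

# Per-year total, for comparison
df["Year_Total"] = df.groupby("year")["number"].transform("sum")

print(df)
```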


Answer 1
2 2018-09-10 09:21:53

Answer 2
2 2018-09-10 09:22:16

Answer 3
0 2018-09-10 09:22:32