
Adding a new column in pandas which is the total sum of the values of another column

I'm using pandas and trying to add a new column called 'Total' that holds the sum of all the vehicle numbers for that year.

From this:

type             year    number
Private cars     2005    401638
Motorcycles      2005    138588
Off peak cars    2005     12947
Motorcycles      2005       846

To something like this:

type             year    number    Total
Private cars     2005    401638    554019
Motorcycles      2005    138588
Off peak cars    2005     12947
Motorcycles      2005       846

Use GroupBy + transform with sum:

df['Year_Total'] = df.groupby('year')['number'].transform('sum')

Note that this gives the yearly total on every row. If you want to blank out the total on certain rows, you need to specify the exact logic for doing so.
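As a minimal, self-contained sketch of the line above, the following rebuilds the four rows from the question and applies the same groupby + transform call. The DataFrame construction is an assumption about how the data is loaded; only the last assignment comes from the answer.

```python
import pandas as pd

# Sample rows copied from the question; how the real data is loaded is assumed.
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars", "Motorcycles"],
    "year": [2005, 2005, 2005, 2005],
    "number": [401638, 138588, 12947, 846],
})

# transform('sum') returns one value per original row, aligned to df's index,
# so the result can be assigned straight back as a new column.
df["Year_Total"] = df.groupby("year")["number"].transform("sum")
print(df)
```

Every row for 2005 ends up with the same value, 554019, which matches the total shown in the question.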

Use GroupBy.transform and then, if necessary, replace the duplicated values:

df['Total'] = df.groupby('year')['number'].transform('sum')
print (df)
            type  year  number  Total
0   Private cars  2005       1      3
1    Motorcycles  2005       2      3
2  Off peak cars  2006       5     20
3    Motorcycles  2006       7     20
4   Motorcycles1  2006       8     20

# Keep the total only on the first row of each year (requires: import numpy as np)
df.loc[df['year'].duplicated(), 'Total'] = np.nan
print(df)
            type  year  number  Total
0   Private cars  2005       1    3.0
1    Motorcycles  2005       2    NaN
2  Off peak cars  2006       5   20.0
3    Motorcycles  2006       7    NaN
4   Motorcycles1  2006       8    NaN
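The same result can also be sketched without NumPy by using Series.mask, which sets values to NaN wherever a condition is True. This is an alternative phrasing rather than the answer's own code, and the sample frame is rebuilt here from the output above.

```python
import pandas as pd

# Sample rows rebuilt from the printed output above.
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

yearly = df.groupby("year")["number"].transform("sum")

# duplicated() is False for the first row of each year and True for later ones;
# mask() turns the True positions into NaN, so the column becomes float.
df["Total"] = yearly.mask(df["year"].duplicated())
print(df)
```

duplicated() marks every occurrence of a year after the first, wherever it appears in the frame, so the total always lands on the first row in which that year occurs.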

Replacing with empty strings is possible but not recommended, because the column then mixes numeric values with strings and some functions will fail:

df.loc[df['year'].duplicated(), 'Total'] = ''
print (df)
            type  year  number Total
0   Private cars  2005       1     3
1    Motorcycles  2005       2      
2  Off peak cars  2006       5    20
3    Motorcycles  2006       7      
4   Motorcycles1  2006       8      
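To illustrate the kind of failure meant here, this small sketch builds a column with the same mixed contents directly (the Series name `mixed` is hypothetical) and tries to sum it. It then shows one way to recover a numeric column with pd.to_numeric.

```python
import pandas as pd

# A column holding the same mix of integers and empty strings as 'Total' above.
mixed = pd.Series([3, "", 20, "", ""], dtype=object)

# Summing mixes int and str addition, which Python does not allow.
try:
    mixed.sum()
    failed = False
except TypeError:
    failed = True

# errors='coerce' converts anything non-numeric (here the empty strings) to NaN.
restored = pd.to_numeric(mixed, errors="coerce")
print(failed, restored.sum())
```

Keeping NaN in the column and blanking it out only when displaying or exporting avoids this round trip.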

This gives a similar dataframe:

import numpy as np

total = df['number'].sum()  # total over the whole column, not per year
df['Total'] = np.ones_like(df['number'].values) * total
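The snippet above totals the whole column, so it matches the yearly total only when the data contains a single year, as in the question. As a rough sketch of the difference on the two-year sample used earlier (the column names `Grand_Total` and `Year_Total` are chosen here for illustration), assigning a scalar also broadcasts it to every row without any NumPy helper:

```python
import pandas as pd

df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

# Assigning a scalar broadcasts it to every row.
df["Grand_Total"] = df["number"].sum()

# Per-year totals, for comparison.
df["Year_Total"] = df.groupby("year")["number"].transform("sum")
print(df)
```

Here the grand total is 23 on every row, while the yearly totals are 3 for 2005 and 20 for 2006.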

