
Appending column totals to a Pandas DataFrame

I have a DataFrame with numerical values. What is the simplest way of appending a row (with a given index value) that represents the sum of each column?

To add a Total column that holds the sum of each row:

df['Total'] = df.sum(axis=1)

To add a row with the column totals:

df.loc['Total'] = df.sum()
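
The two one-liners above can be combined. The sketch below uses made-up sample data (not from the original post) and adds the total row first, so that the row sums computed afterwards also yield a grand total in the bottom-right cell:

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({"a": [10, 20], "b": [100, 200]})

# Row of column totals, added before any 'Total' column exists
df.loc["Total"] = df.sum()

# Column of row totals; for the 'Total' row this is the grand total
df["Total"] = df.sum(axis=1)

print(df)
```

Reversing the order also works, since the sum of the row totals equals the sum of the column totals.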

**Get Both Column Total and Row Total**

This gives totals for both rows and columns:

import pandas as pd

df = pd.DataFrame({'a': [10, 20], 'b': [100, 200], 'c': ['a', 'b']})

# numeric_only=True skips the string column 'c'
df.loc['Column_Total'] = df.sum(numeric_only=True, axis=0)
df.loc[:, 'Row_Total'] = df.sum(numeric_only=True, axis=1)

print(df)

                 a      b    c  Row_Total
0             10.0  100.0    a      110.0
1             20.0  200.0    b      220.0
Column_Total  30.0  300.0  NaN      330.0

One way is to create a DataFrame with the column sums and append it. DataFrame.append(...) did this in older pandas but was removed in pandas 2.0; pd.concat is the replacement. For example:

import numpy as np
import pandas as pd
# Create some sample data
df = pd.DataFrame({"A": np.random.randn(5), "B": np.random.randn(5)})
# Sum the columns:
sum_row = {col: df[col].sum() for col in df}
# Turn the sums into a DataFrame with one row with an index of 'Total':
sum_df = pd.DataFrame(sum_row, index=["Total"])
# Now append the row (on pandas < 2.0, df.append(sum_df) also works):
df = pd.concat([df, sum_df])

I have done it this way:

df = pd.concat([df, pd.DataFrame(df.sum(axis=0), columns=['Grand Total']).T])

This will add a column of totals for each row:

df = pd.concat([df, pd.DataFrame(df.sum(axis=1), columns=['Total'])], axis=1)

It seems a little annoying to have to turn the Series object (or, in the answer above, the dict) back into a DataFrame and then append it, but it does work for my purpose.

It seems like this should just be a method of the DataFrame, the way pivot_table has margins.

Perhaps someone knows of an easier way.
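
For comparison, the margins option of pivot_table mentioned above can be sketched as follows. It only applies when the data is in long form and is being pivoted anyway; the column names and values here are hypothetical:

```python
import pandas as pd

# Hypothetical long-form data, made up for illustration
sales = pd.DataFrame({
    "region": ["N", "N", "S"],
    "product": ["x", "y", "x"],
    "amount": [1, 2, 3],
})

# margins=True adds a totals row and a totals column;
# margins_name sets the label used for both
pt = pd.pivot_table(
    sales,
    values="amount",
    index="region",
    columns="product",
    aggfunc="sum",
    fill_value=0,
    margins=True,
    margins_name="Total",
)

print(pt)
```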

You can use the append method to add a Series whose index matches the DataFrame's columns. DataFrame.append was removed in pandas 2.0, so this works only on older versions. For example:

df.append(pd.Series(df.sum(), name='Total'))
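
On pandas 2.0 and later, roughly the same result can be had with pd.concat. A minimal sketch, with sample data made up for illustration:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# df.sum() is a Series indexed by column name; name it 'Total',
# turn it into a one-column frame, and transpose to a one-row frame
total_row = df.sum().rename("Total").to_frame().T

# Stack the original rows and the total row
result = pd.concat([df, total_row])

print(result)
```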

  1. Calculate the sum and convert the result into a list (axis=1: row-wise sum, axis=0: column-wise sum)
  2. Add the result of step 1 to the existing DataFrame under a new name

new_sum_col = list(df.sum(axis=1))
df['new_col_name'] = new_sum_col

I did not find a modern pandas approach! This solution is a bit dirty because of the two chained transpositions; I do not know how to use .assign on rows.

# Generate DataFrame
import pandas as pd
df = pd.DataFrame({'a': [10,20],'b':[100,200],'c': ['a','b']})

# Solution
df.T.assign(Total = lambda x: x.sum(axis=1)).T

Output:

        a    b   c
0      10  100   a
1      20  200   b
Total  30  300  ab
For those who have trouble because the result is 0 or NaN, check the dtype first.

df.dtypes
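
A common cause is numbers stored as strings, which are then skipped or concatenated instead of added. The sketch below assumes that situation (the data is hypothetical) and converts the columns with pd.to_numeric before summing:

```python
import pandas as pd

# Hypothetical data: the numbers were read in as strings
df = pd.DataFrame({"a": ["10", "20"], "b": ["100", "200"]})
print(df.dtypes)  # shows a string/object dtype rather than a numeric one

# Convert each column to a numeric dtype, then add the totals row
numeric_df = df.apply(pd.to_numeric)
numeric_df.loc["Total"] = numeric_df.sum()

print(numeric_df)
```

If some values cannot be parsed, pd.to_numeric raises by default; passing errors="coerce" turns them into NaN instead.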


 