Append a Series as a row to a pandas DataFrame (Python 3.4)

Question

Suppose I have a data frame like:

import pandas as pd
import numpy as np

df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"]), })

This looks like:

    A   B           C   D   E        
0   1   2013-01-02  1   3   test    
1   1   2013-01-02  1   3   train   
2   1   2013-01-02  1   3   test    
3   1   2013-01-02  1   3   train

I want to append a totals row that sums the numeric columns and puts the label "Total" in column E.

So what I have is:

totals=pd.Series('Total', index=['E'])
totals = df2.sum(numeric_only=True).append(totals)

which yields

totals
A        4
C        4
D       12
E    Total
dtype: object

So if I try

df2.append(totals, ignore_index=True)

I get

    A   B                    C   D   E
0   1   2013-01-02 00:00:00  1   3   test
1   1   2013-01-02 00:00:00  1   3   train
2   1   2013-01-02 00:00:00  1   3   test
3   1   2013-01-02 00:00:00  1   3   train
4   4   NaN                  4   12  NaN

My question is: why doesn't column 'E' contain "Total" in the new row, and why is it NaN instead?

Answer 1

Not sure why, but a slight change works.

total = df2.sum()
total = total.append(pd.Series('Total', index=['E']))
df2.append(total, ignore_index=True)

Hope that helps!

Answer 2

You have to include Total among the categories of column E, for example by passing categories=["test","train","Total"] to pd.Categorical.

I think you get NaN because this category does not exist.

import pandas as pd
import numpy as np


df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"], 
                                           categories=["test","train","Total"])})


totals=pd.Series('Total', index=['E'])
totals = df2.sum(numeric_only=True).append(totals)
print(df2.append(totals, ignore_index=True))
   A          B  C   D      E
0  1 2013-01-02  1   3   test
1  1 2013-01-02  1   3  train
2  1 2013-01-02  1   3   test
3  1 2013-01-02  1   3  train
4  4        NaT  4  12  Total
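
The NaN described in this answer can be reproduced without the DataFrame. A minimal sketch, assuming a reasonably recent pandas; the variable names are only for illustration and are not from the original post:

```python
import pandas as pd

# "Total" is not among the declared categories, so it cannot be stored
# and ends up as a missing value.
without_total = pd.Categorical(["test", "train", "Total"],
                               categories=["test", "train"])
missing_without = pd.isna(without_total).tolist()

# Declaring "Total" as a category up front keeps the value.
with_total = pd.Categorical(["test", "train", "Total"],
                            categories=["test", "train", "Total"])
missing_with = pd.isna(with_total).tolist()

print(missing_without)
print(missing_with)
```

The same rule seems to be what turns the appended "Total" into NaN in the question: the value is dropped because column E has no such category.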

Answer 3

First of all, you will get a NaN in column E unless the value is an existing category (i.e. 'test' or 'train'). So first we must add the new value Total to the categories and assign the result back to the column.

After doing this, your original method will work. However, I believe this is a more straightforward approach:

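# Allow 'Total' as a value in the categorical column
df2['E'] = df2.E.cat.add_categories('Total')
# Add the sums as a new last row (.loc replaces .ix, which has been removed from pandas)
df2.loc[len(df2)] = df2.sum(numeric_only=True)
# Label the new row in the last column (E)
df2.iat[-1, -1] = 'Total'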

>>> df2
   A          B  C   D      E
0  1 2013-01-02  1   3   test
1  1 2013-01-02  1   3  train
2  1 2013-01-02  1   3   test
3  1 2013-01-02  1   3  train
4  4        NaT  4  12  Total
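
Note that DataFrame.append and Series.append were deprecated in pandas 1.4 and removed in pandas 2.0, so the append-based snippets in this thread will not run as written on current versions. A rough equivalent using pd.concat might look like the following. This is a sketch under that assumption, not part of the original answers, and the names totals_row and result are made up for illustration:

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "A": 1.0,
    "B": pd.Timestamp("20130102"),
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "E": pd.Categorical(["test", "train", "test", "train"]),
})

# Sum only the numeric columns (A, C, D), then add the label for column E.
row = df2.sum(numeric_only=True).to_dict()
row["E"] = "Total"

# Build a one-row frame and stack it under the original.
# Column B is absent from the new row, so it is filled with a missing value.
totals_row = pd.DataFrame([row])
result = pd.concat([df2, totals_row], ignore_index=True)

print(result)
```

Because the new row holds E as a plain string, the combined E column may come back as object dtype rather than categorical; result["E"].astype("category") would convert it back if that matters.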
