Appending a Series as a row to a pandas DataFrame (Python 3.4)

Question

Suppose I have a DataFrame such as:

df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"]), })

which looks like this:

    A   B           C   D   E        
0   1   2013-01-02  1   3   test    
1   1   2013-01-02  1   3   train   
2   1   2013-01-02  1   3   test    
3   1   2013-01-02  1   3   train

I want to add a "Total" row that sums the numeric columns and puts the label "Total" in column E.

So I have:

totals=pd.Series('Total', index=['E'])
totals = df2.sum(numeric_only=True).append(totals)

which yields

totals
A        4
C        4
D       12
E    Total
dtype: object

So if I try

df2.append(totals, ignore_index=True)

I get

    A   B                       C   D   E
0   1   2013-01-02 00:00:00 1   3   test
1   1   2013-01-02 00:00:00 1   3   train   
2   1   2013-01-02 00:00:00 1   3   test    
3   1   2013-01-02 00:00:00 1   3   train
4   4   NaN                 4   12  NaN

My question is: why does column E not contain "Total", and why is it NaN instead?

Answer 1

I'm not sure why, but a small change makes it work.

total = df2.sum()
total = total.append(pd.Series('Total', index=['E']))
df2.append(total, ignore_index=True)

Hope that helps!

Answer 2

You have to include Total in the categories of column E, i.e. categories=["test","train","Total"].

I think you get NaN because that category does not exist.

import pandas as pd
import numpy as np


df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"], 
                                           categories=["test","train","Total"])})


totals=pd.Series('Total', index=['E'])
totals = df2.sum(numeric_only=True).append(totals)
print(df2.append(totals, True))
   A          B  C   D      E
0  1 2013-01-02  1   3   test
1  1 2013-01-02  1   3  train
2  1 2013-01-02  1   3   test
3  1 2013-01-02  1   3  train
4  4        NaT  4  12  Total
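
The claim that a missing category is what produces the NaN can be checked in isolation. The following is a minimal sketch, not part of the original answer; the variable names `allowed`, `without_total` and `with_total` are made up for illustration. It builds the same two values once with "Total" absent from the categories and once with it present.

```python
import pandas as pd

# Categories as in the question: "Total" is not among them.
allowed = ["test", "train"]

# A value outside the declared categories is stored as missing (code -1).
without_total = pd.Categorical(["test", "Total"], categories=allowed)

# With "Total" declared as a category, the value is kept.
with_total = pd.Categorical(["test", "Total"], categories=allowed + ["Total"])

print(without_total.isna())  # True marks the entry that was dropped to NaN
print(list(with_total))
```

This is why declaring the category up front, as in the code above, lets the appended label survive.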

Answer 3

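First, you will get NaN in column E unless the value is one of the existing categories (i.e. 'test' or 'train'). So we first have to add your new value 'Total' to the categories and then assign the result back to the column.

Once that is done, your original approach will work. However, I think this is a simpler way: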

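# Register 'Total' as a valid category before writing it into column E.
df2['E'] = df2.E.cat.add_categories('Total')
# .loc is used here because .ix was removed in pandas 1.0; a new label appends a row.
df2.loc[len(df2)] = df2.sum()
# Overwrite the last cell of the last column (E) with the label.
df2.iat[-1, -1] = 'Total'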

>>> df2
   A          B  C   D      E
0  1 2013-01-02  1   3   test
1  1 2013-01-02  1   3  train
2  1 2013-01-02  1   3   test
3  1 2013-01-02  1   3  train
4  4        NaT  4  12  Total
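
For readers on a recent pandas, where `DataFrame.append` no longer exists (it was removed in pandas 2.0), the same idea can be sketched with `pd.concat`. This is an adaptation under assumptions rather than the answerer's code: the names `numeric_totals`, `total_row` and `result` are illustrative, and the totals row is given a categorical E column with the same categories so that the concatenated column should stay categorical.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "A": 1.0,
    "B": pd.Timestamp("20130102"),
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "E": pd.Categorical(["test", "train", "test", "train"]),
})

# Step 1: make 'Total' a valid category of column E.
df2["E"] = df2["E"].cat.add_categories("Total")

# Step 2: sum only the numeric columns (A, C, D) and turn the result into a one-row frame.
numeric_totals = df2.sum(numeric_only=True)
total_row = numeric_totals.to_frame().T

# Step 3: give the new row a categorical E that shares df2's categories.
total_row["E"] = pd.Categorical(["Total"], categories=df2["E"].cat.categories)

# Step 4: stack the row under the original frame; column B has no total and is left missing.
result = pd.concat([df2, total_row], ignore_index=True)
print(result)
```

Summing with `numeric_only=True` is a deliberate choice here: it skips the timestamp and categorical columns explicitly instead of relying on how a given pandas version treats them.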

3 answers

Answer 1
0 2016-03-03 22:45:47

Answer 2
0 2016-03-03 23:07:53

Answer 3
0 2016-03-04 02:45:55
