Add a new column based on a boolean value in another column

Question

I am trying to add a new column to a DataFrame based on the boolean values in another column.

Given a DataFrame like this:

from pandas import DataFrame

snr = DataFrame({'name': ['A', 'B', 'C', 'D', 'E'], 'seniority': [False, False, False, True, False]})

The furthest I have gotten so far is:

def refine_seniority(contact):
    contact['refined_seniority'] = 'Senior' if contact['seniority'] else 'Non-Senior'

snr.apply(refine_seniority)

But I get this error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-208-0694ebf79a50> in <module>()
      2     contact['refined_seniority'] = 'Senior' if contact['seniority'] else 'Non-Senior'
      3 
----> 4 snr.apply(refine_seniority)
      5 
      6 snr

/usr/lib/python2.7/dist-packages/pandas/core/frame.pyc in apply(self, func, axis, broadcast, raw, args, **kwds )
   4414                     return self._apply_raw(f, axis)
   4415                 else:
-> 4416                     return self._apply_standard(f, axis)
   4417             else:
   4418                 return self._apply_broadcast(f, axis)

/usr/lib/python2.7/dist-packages/pandas/core/frame.pyc in _apply_standard(self, func, axis, ignore_failures)
   4489                     # no k defined yet
   4490                     pass
-> 4491                 raise e
   4492 
   4493 

KeyError: ('seniority', u'occurred at index name')

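I feel like I am missing something fundamental about DataFrames, but I am stuck.

What is the correct way to add a new column based on a boolean value in a different column?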

Answer 1

You can create a dictionary and call map:

In [176]:

temp = {True:'senior', False:'Non-senior'}
snr['refined_seniority'] = snr['seniority'].map(temp)
snr
Out[176]:
  name seniority refined_seniority
0    A     False        Non-senior
1    B     False        Non-senior
2    C     False        Non-senior
3    D      True            senior
4    E     False        Non-senior

As user @Jeff pointed out, map and apply should be a last resort when a vectorized solution is available.

Alternatively, use numpy's where:

In [178]:

snr['refined_seniority'] = np.where(snr['seniority'] == True, 'senior', 'Non-senior')
snr
Out[178]:
  name seniority refined_seniority
0    A     False        Non-senior
1    B     False        Non-senior
2    C     False        Non-senior
3    D      True            senior
4    E     False        Non-senior
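
Because the seniority column is already boolean, the comparison with True is not strictly needed. The following is a minimal self-contained sketch of the same np.where approach, assuming pandas and numpy are installed and that the column contains no missing values:

```python
import numpy as np
import pandas as pd

snr = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D', 'E'],
    'seniority': [False, False, False, True, False],
})

# The boolean column can be used directly as the condition for np.where.
snr['refined_seniority'] = np.where(snr['seniority'], 'senior', 'Non-senior')
print(snr)
```

This evaluates the whole column at once instead of calling a Python function per element.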

If you change your function to the following, it will work:

In [187]:

def refine_seniority(contact):
    if contact == True:
        return 'senior'
    else:
        return 'Non-senior'

snr['refined_seniority'] = snr['seniority'].apply(refine_seniority)
snr
Out[187]:
  name seniority refined_seniority
0    A     False        Non-senior
1    B     False        Non-senior
2    C     False        Non-senior
3    D      True            senior
4    E     False        Non-senior

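What you wrote fails because you called apply on the whole DataFrame. By default, apply passes each column to the function as a Series, and that Series is indexed by the row labels, so the label 'seniority' does not exist in it. See below: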

In [193]:

def refine_seniority(contact):
    print(contact)


snr['refined_seniority'] = snr.apply(refine_seniority)

0    A
1    B
2    C
3    D
4    E
Name: name, dtype: object
0    False
1    False
2    False
3     True
4    False
Name: seniority, dtype: object

Here you can see that it prints two pandas Series, one per column. Neither has a key named 'seniority', hence the error.
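
For completeness, the row-wise approach from the question can also be made to work. This is a sketch under two assumptions: apply is called with axis=1, so each row (not each column) is passed to the function, and the function returns the label instead of assigning into the row. It is typically slower than the vectorized options above.

```python
import pandas as pd

snr = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D', 'E'],
    'seniority': [False, False, False, True, False],
})

def refine_seniority(contact):
    # With axis=1, `contact` is one row as a Series indexed by column name,
    # so contact['seniority'] is a valid lookup.
    return 'senior' if contact['seniority'] else 'Non-senior'

# apply collects the returned values into a new Series aligned with the rows.
snr['refined_seniority'] = snr.apply(refine_seniority, axis=1)
print(snr)
```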

Answer 2

snr['refined_seniority'] = snr['seniority'].map({True: 'senior', False: 'Non-senior'})


Answer 1
2 accepted 2014-08-29 14:18:29

Answer 2
0 2018-10-29 14:18:36
