Apply function to each cell in DataFrame that depends on the column name in pandas
How can I apply a function to each cell of a DataFrame when the function depends on the column name?
I know about pandas.DataFrame.applymap, but it does not seem to let the function depend on the column name:
import numpy as np
import pandas as pd
np.random.seed(1)
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
format = lambda x: '%.2f' % x
frame = frame.applymap(format)
print(frame)
which yields:
               b         d         e
Utah    1.624345 -0.611756 -0.528172
Ohio   -1.072969  0.865408 -2.301539
Texas   1.744812 -0.761207  0.319039
Oregon -0.249370  1.462108 -2.060141
            b      d      e
Utah     1.62  -0.61  -0.53
Ohio    -1.07   0.87  -2.30
Texas    1.74  -0.76   0.32
Oregon  -0.25   1.46  -2.06
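Instead, I want the function applied to each cell to receive the current cell's column name as an argument.
I would rather not loop over the columns myself, e.g.: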
def format2(cell_value, column_name):
    return '{0}_{1:.2f}'.format(column_name, cell_value)

for column_name in frame.columns.values:
    print('column_name: {0}'.format(column_name))
    # args must be a tuple, hence the trailing comma
    frame[column_name] = frame[column_name].apply(format2, args=(column_name,))
print(frame)
which returns:
              b        d        e
Utah     b_1.62  d_-0.61  e_-0.53
Ohio    b_-1.07   d_0.87  e_-2.30
Texas    b_1.74  d_-0.76   e_0.32
Oregon  b_-0.25   d_1.46  e_-2.06
(This is just an example. The function I want to apply to the cells may do more than prepend the column name.)
Why not:
>>> frame
               b         d         e
Utah   -0.579869  0.101039 -0.225319
Ohio   -1.791191 -0.026241 -0.531509
Texas   0.785618 -1.422460 -0.740677
Oregon  1.302074  0.241523  0.860346
>>> frame['e'] = ['%.2f' % val for val in frame['e'].values]
>>> frame
               b         d      e
Utah   -0.579869  0.101039  -0.23
Ohio   -1.791191 -0.026241  -0.53
Texas   0.785618 -1.422460  -0.74
Oregon  1.302074  0.241523   0.86
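If different columns need different functions, the per-column assignment above could be driven by a mapping from column name to function. This is only a sketch: the `formatters` dict and the sample values are assumptions for illustration, and columns not listed in the mapping are left untouched.

```python
import pandas as pd

# Small sample frame (values are illustrative only)
frame = pd.DataFrame(
    {"b": [1.624345, -1.072969],
     "d": [-0.611756, 0.865408],
     "e": [-0.528172, -2.301539]},
    index=["Utah", "Ohio"],
)

# Hypothetical mapping: column name -> function applied to each cell of that column
formatters = {
    "b": lambda v: "%.2f" % v,
    "e": lambda v: "%.1f" % v,
}

for column_name, func in formatters.items():
    # Same list-comprehension assignment as above, one column at a time
    frame[column_name] = [func(val) for val in frame[column_name].values]

print(frame)
```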
If you don't want to loop over the columns, you can do something like this:
frame.T.apply(lambda x: x.apply(format2, args=(x.name,)), axis=1).T
Out[289]:
              b        d        e
Utah     b_0.90  d_-0.68  e_-0.12
Ohio    b_-0.94  d_-0.27   e_0.53
Texas   b_-0.69  d_-0.40  e_-0.69
Oregon  b_-0.85  d_-0.67  e_-0.01
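After transposing the df, the column names become the index, and each one can be referenced inside the apply function through the .name attribute.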
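To see what .name refers to in each case, here is a small sketch (the two-column frame is made up for illustration). It only returns the name of each Series that apply passes to the function:

```python
import pandas as pd

frame = pd.DataFrame({"b": [1.0, 2.0], "d": [3.0, 4.0]}, index=["Utah", "Ohio"])

# Default axis=0: each column is passed as a Series named after the column
col_names = frame.apply(lambda s: s.name).tolist()

# axis=1: each row is passed instead, so .name is the row label
row_names = frame.apply(lambda s: s.name, axis=1).tolist()

# After transposing, axis=1 walks the original columns again
t_names = frame.T.apply(lambda s: s.name, axis=1).tolist()

print(col_names)  # ['b', 'd']
print(row_names)  # ['Utah', 'Ohio']
print(t_names)    # ['b', 'd']
```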
I improved on the other answer: axis=0 is the default, so both the transpose and the axis argument can be omitted:
a = frame.apply(lambda x: x.apply(format2, args=(x.name,)))
print(a)
              b        d        e
Utah     b_1.62  d_-0.61  e_-0.53
Ohio    b_-1.07   d_0.87  e_-2.30
Texas    b_1.74  d_-0.76   e_0.32
Oregon  b_-0.25   d_1.46  e_-2.06
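One thing to watch with args: it needs to be a tuple, so a one-element tuple takes a trailing comma. A bare string there would be unpacked character by character, which only happens to work for one-letter column names. A sketch with longer column names (the names and values are made up for illustration), plus an equivalent closure-based variant that avoids args entirely:

```python
import pandas as pd

def format2(cell_value, column_name):
    return '{0}_{1:.2f}'.format(column_name, cell_value)

# Hypothetical frame with multi-character column names
frame = pd.DataFrame(
    {'price': [1.624345, -1.072969], 'qty': [-0.611756, 0.865408]},
    index=['Utah', 'Ohio'],
)

# Pass the column name through args as a one-element tuple
a = frame.apply(lambda x: x.apply(format2, args=(x.name,)))

# Alternative: capture the column name in a closure and use Series.map
b = frame.apply(lambda col: col.map(lambda v: format2(v, col.name)))

print(a)
```

Both variants still visit the columns one by one internally; they just hide the loop inside apply.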