How to apply a function to a column in Pandas depending on the value in another column?
Thank you in advance for reading.
I have a dataframe:
import pandas as pd
df = pd.DataFrame({'Words':[{'Sec': ['level']},{'Sec': ['levels']},{'Sec': ['level']},{'Und': ['ba ']},{'Pro': ['conf'],'ProAbb': ['cth']}],'Conflict':[None,None,None,None,'Match Conflict']})
Conflict Words
0 None {u'Sec': [u'level']}
1 None {u'Sec': [u'levels']}
2 None {u'Sec': [u'level']}
3 None {u'Und': [u'ba ']}
4 Match Conflict {u'ProAbb': [u'cth'], u'Pro': [u'conf']}
I want to apply a routine that, for each element in 'Words', checks if Conflict = 'Match Conflict' and, if so, applies some function to the value in 'Words'.
For instance, using the following placeholder function:
def func(x):
    x = x.clear()
    return x
I write:
df['Words'] = df[df['Conflict'] == 'Match Conflict']['Words'].apply(lambda x: func(x))
My expected output is:
Conflict Words
0 None {u'Sec': [u'level']}
1 None {u'Sec': [u'levels']}
2 None {u'Sec': [u'level']}
3 None {u'Und': [u'ba ']}
4 Match Conflict None
Instead I get:
Conflict Words
0 None NaN
1 None NaN
2 None NaN
3 None NaN
4 Match Conflict None
The function is applied only to the row which has Conflict = 'Match Conflict', but at the expense of the other rows (which all become NaN). I assumed the other rows would be left untouched; obviously this is not the case.
Can you explain how I might achieve my desired output without dropping all of the information in the Words column? I believe the answer may lie with np.where, but I have not been able to make that work; the code above was the best I could come up with.
Any help much appreciated. Thanks.
You should rewrite the function to work with all of your rows:
def func(x, match):
    if x['Conflict'] == match:
        return None
    return x['Words']
df['Words'] = df.apply(lambda row: func(row, 'Match Conflict'), axis=1)
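To see why the original attempt wiped the other rows, and to check the row-wise version against the sample data, here is a minimal sketch. It assumes a recent pandas, and it uses `len` as a stand-in for the per-row function so that the result for the matching row is visibly non-missing. The filtered `apply` returns a Series that contains only the matching index label, and assigning that Series to a column aligns on the index, so every label it lacks is filled with NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'Words': [{'Sec': ['level']}, {'Sec': ['levels']}, {'Sec': ['level']},
              {'Und': ['ba ']}, {'Pro': ['conf'], 'ProAbb': ['cth']}],
    'Conflict': [None, None, None, None, 'Match Conflict'],
})

# The filtered apply only returns the matching rows (index label 4 here).
# `len` is a stand-in function, chosen so the result is not itself missing.
subset = df[df['Conflict'] == 'Match Conflict']['Words'].apply(len)
subset_labels = list(subset.index)

# Assigning the shorter Series to a column aligns on the index, so the
# labels it does not contain (0 to 3) are filled with NaN.
demo = df.copy()
demo['Words'] = subset
missing_count = int(demo['Words'].isna().sum())

# Row-wise version from the answer: every row returns a value, so nothing
# is lost to alignment.
def func(row, match):
    if row['Conflict'] == match:
        return None
    return row['Words']

fixed = df.apply(lambda row: func(row, 'Match Conflict'), axis=1)
```

`subset_labels` holds only the label of the matching row, which is why the other four entries of `demo['Words']` come out missing.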
You can also use where, as you described:
condition = df.Conflict != 'Match Conflict'
df['Words'] = df.Words.where(condition, None)
Conflict Words
0 None {u'Sec': [u'level']}
1 None {u'Sec': [u'levels']}
2 None {u'Sec': [u'level']}
3 None {u'Und': [u'ba ']}
4 Match Conflict None
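Note that `Series.where` keeps the values where the condition is True and replaces the rest, which is why the condition above is written with `!=`. `Series.mask` is the inverse and replaces the values where the condition is True. The sketch below shows both spellings on the sample data; it relies on the default replacement value (a missing value) rather than passing `None` explicitly, which is an assumption made here for simplicity.

```python
import pandas as pd

df = pd.DataFrame({
    'Words': [{'Sec': ['level']}, {'Sec': ['levels']}, {'Sec': ['level']},
              {'Und': ['ba ']}, {'Pro': ['conf'], 'ProAbb': ['cth']}],
    'Conflict': [None, None, None, None, 'Match Conflict'],
})

is_conflict = df['Conflict'] == 'Match Conflict'

# where: keep entries where the condition holds, blank out the others.
via_where = df['Words'].where(~is_conflict)

# mask: blank out entries where the condition holds, keep the others.
via_mask = df['Words'].mask(is_conflict)
```

Both results leave rows 0 to 3 untouched and hold a missing value in row 4.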
Suppose a placeholder function:
def func(x):
    x = x.clear()
    return x
Then we can use boolean indexing and apply to obtain the desired output.
# .ix was removed in pandas 1.0; .loc is the label-based replacement
mask = df['Conflict'] == 'Match Conflict'
df.loc[mask, 'Words'] = df.loc[mask, 'Words'].apply(func)
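One caveat with the placeholder: `dict.clear()` empties the dict in place and returns None, so the function both mutates the original object and yields None. For a function that returns a new value, a minimal sketch of the same idea follows. `resolve` is a hypothetical stand-in, not something from the question, and the new column is built with a plain comprehension over the boolean mask so that only flagged rows are passed to the function.

```python
import pandas as pd

df = pd.DataFrame({
    'Words': [{'Sec': ['level']}, {'Sec': ['levels']}, {'Sec': ['level']},
              {'Und': ['ba ']}, {'Pro': ['conf'], 'ProAbb': ['cth']}],
    'Conflict': [None, None, None, None, 'Match Conflict'],
})

def resolve(words):
    # Hypothetical conflict handler: keep only the 'Pro' entry and
    # return a new dict instead of mutating the input.
    return {key: value for key, value in words.items() if key == 'Pro'}

mask = df['Conflict'] == 'Match Conflict'

# Call resolve only on flagged rows; pass every other value through unchanged.
df['Words'] = pd.Series(
    [resolve(words) if flagged else words
     for words, flagged in zip(df['Words'], mask)],
    index=df.index,
)
```

Because every row contributes a value to the new Series, no index alignment gaps appear and the unflagged rows keep their original dicts.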
I wanted to provide a concise one-liner but I was too late :,(