How to compare two columns and return a value from a third column in a Pandas DataFrame
Sample data:
import pandas as pd

sample = {'name': ['Delinquency Rate', 'Cumulative Probability'],
          'value': ['Dlnqy', 'Prbblty'],
          'new_name': ['Dlnqncy Rt', 'Cmltv Prbblty']}
test = pd.DataFrame(sample)
test
new_name is created by removing all the vowels from 'name'. I want to compare the first 3 characters of 'value' with the first 3 characters of each word in 'new_name', and if they match, return the corresponding value from the 'name' column. For example, 'Dln' from 'value' matches 'Dlnqncy' in 'new_name', so the result should be 'Delinquency' from 'name'. The logic should work like this:
'if value[:3] in new_name[:3] then return name'
The following is what I have so far. It works fine if I have only two columns, but it doesn't work if I want to compare 'value' and 'new_name' and return 'name'.
def get_matches(name, value, new_name, default=''):
    return next((word for word in new_name.split() if str(value)[:3] in word[:3]), default)
test['match'] = test[['name', 'value', 'new_name']].apply(lambda row: get_matches(*row, default=' '), axis=1)
In the output, the column 'match' should contain 'Delinquency' and 'Probability' (as they appear in the 'name' column).
Here you go:
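import numpy as np

# Compares only the first 3 characters of the whole new_name string with those of value
(test
 .assign(match=lambda x: np.where(x['new_name'].str[:3] == x['value'].str[:3], x['name'], x['value']))
)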
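Note that the snippet above compares only the start of the whole new_name string, so a row matches only when its first word matches, and the full 'name' is returned rather than a single word. Below is a minimal sketch of a word-by-word variant. It assumes that 'name' and 'new_name' always contain the same number of words in the same order; the helper name match_word is hypothetical and not part of pandas.

```python
import pandas as pd

test = pd.DataFrame({
    'name': ['Delinquency Rate', 'Cumulative Probability'],
    'value': ['Dlnqy', 'Prbblty'],
    'new_name': ['Dlnqncy Rt', 'Cmltv Prbblty'],
})

def match_word(name, value, new_name, default=''):
    """Return the word of `name` whose counterpart in `new_name`
    starts with the first three characters of `value`."""
    prefix = str(value)[:3]
    # Assumption: the i-th word of `name` corresponds to the i-th word of `new_name`.
    pairs = zip(name.split(), new_name.split())
    return next((full for full, short in pairs if short.startswith(prefix)), default)

# Apply the helper row by row across the three columns.
test['match'] = [
    match_word(n, v, nn)
    for n, v, nn in zip(test['name'], test['value'], test['new_name'])
]
print(test)
```

With the sample data, 'match' should come out as 'Delinquency' and 'Probability'. Rows with no matching word get the default value.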