Exact matching string with "==" operator between Str and a list of strings
I have this example df:
import pandas as pd

df6 = pd.DataFrame({
'answer1': ['Lo', 'New York', 'Toronto'],
'answer2': ['London', 'New', 'Paris'],
'answer3': ['CA', 'CA', 'CA'],
'correct': [['London'], ['New York'], ['Toronto']]
})
df6
gives:
    answer1 answer2 answer3     correct
0        Lo  London      CA    [London]
1  New York     New      CA  [New York]
2   Toronto   Paris      CA   [Toronto]
I am trying to get the name of the column (answer1, answer2, etc.) that contains the text in the correct column, and store it in a new column called Answer, by matching the values in str format. The correct column holds its data as a list.
I used the following code to do so:
cols = df6.filter(like='answer').columns
df6['Answer'] = df6[cols].apply(lambda s: ', '.join(cols[(m:=[str(s[col]) in str(df6.loc[s.name, 'correct']) for col in cols])]) , axis=1)
But I get inaccurate results:
    answer1 answer2 answer3     correct            Answer
0        Lo  London      CA    [London]  answer1, answer2
1  New York     New      CA  [New York]  answer1, answer2
2   Toronto   Paris      CA   [Toronto]           answer1
It should be:
    answer1 answer2 answer3     correct   Answer
0        Lo  London      CA    [London]  answer2
1  New York     New      CA  [New York]  answer1
2   Toronto   Paris      CA   [Toronto]  answer1
If I change in to ==, the code will not work because the data types are not comparable (str with list), and I also need to wrap the list items in a str to avoid multiple-data issues in my original df.
I do not know how to achieve this.
Strip the square brackets from correct (take the first element of each list), check which cells of the df contain those values, and then collect the names of the matching columns:
df6['answer'] = df6.isin(df6['correct'].str[0].to_list()).agg(lambda s: s.index[s].values, axis=1)
df6
    answer1 answer2 answer3     correct     answer
0        Lo  London      CA    [London]  [answer2]
1  New York     New      CA  [New York]  [answer1]
2   Toronto   Paris      CA   [Toronto]  [answer1]
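One thing worth noting about the approach above: DataFrame.isin with a flat list tests every cell against the correct values of all rows, not only its own row, so a cell holding another row's correct answer would also be flagged. A minimal sketch of a row-wise variant, assuming each correct list holds exactly one item (as in the example), might look like this. The names target and mask are illustrative and not part of the original answer.

```python
import pandas as pd

df6 = pd.DataFrame({
    'answer1': ['Lo', 'New York', 'Toronto'],
    'answer2': ['London', 'New', 'Paris'],
    'answer3': ['CA', 'CA', 'CA'],
    'correct': [['London'], ['New York'], ['Toronto']],
})

cols = df6.filter(like='answer').columns

# First element of each list in 'correct' (assumption: one-item lists).
target = df6['correct'].str[0]

# Compare every answer column with the target of its own row (exact equality).
mask = df6[cols].eq(target, axis=0)

# Join the names of the columns that matched in each row.
df6['Answer'] = mask.apply(lambda row: ', '.join(cols[row.to_numpy()]), axis=1)
```

Because eq aligns target on the row index, 'Lo' is compared only with 'London' and no longer matches.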
I think you should check whether the element in the answer column is in the list, not in the string, of the correct column:
df6['Answer'] = df6[cols].apply(lambda s: ', '.join(cols[(m:=[str(s[col]) in list(df6.loc[s.name, 'correct']) for col in cols])]) , axis=1)
This should work, as it checks whether the answerX element is in the correct list.
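To illustrate why the original code over-matched, here is a small sketch of the difference between in on a string and in on a list, followed by a more readable version of the same row-wise check. The helper name matching_columns is hypothetical and used only for this example.

```python
import pandas as pd

correct = ['London']

# str(correct) is the text "['London']", so `in` performs a substring search.
substring_hit = 'Lo' in str(correct)   # True: 'Lo' occurs inside the text

# On the list itself, `in` compares against whole elements.
element_hit = 'Lo' in correct          # False
exact_hit = 'London' in correct        # True

df6 = pd.DataFrame({
    'answer1': ['Lo', 'New York', 'Toronto'],
    'answer2': ['London', 'New', 'Paris'],
    'answer3': ['CA', 'CA', 'CA'],
    'correct': [['London'], ['New York'], ['Toronto']],
})
cols = df6.filter(like='answer').columns

def matching_columns(row):
    # Hypothetical helper: keep the answer columns whose value, as a str,
    # is an element of this row's 'correct' list.
    return ', '.join(col for col in cols if str(row[col]) in row['correct'])

df6['Answer'] = df6.apply(matching_columns, axis=1)
```

Since the check is list membership, it should also cope with correct lists that hold more than one item, in which case every matching column name is joined into Answer.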