Exact string matching with the "==" operator between a str and a list of strings

Question

I have this example df:

import pandas as pd

df6 = pd.DataFrame({
    'answer1': ['Lo', 'New York', 'Toronto'],
    'answer2': ['London', 'New', 'Paris'],
    'answer3': ['CA', 'CA', 'CA'],
    'correct': [['London'], ['New York'], ['Toronto']]
})

df6

gives:

    answer1    answer2     answer3     correct
0   Lo         London       CA         [London]
1   New York   New          CA         [New York]
2   Toronto    Paris        CA         [Toronto]

I am trying to create a new column called Answer that holds the name of the column (answer1, answer2, etc.) whose value matches the text in the correct column, comparing the values as strings. The correct column stores its data as lists.

I used the following code to do so:

cols = df6.filter(like='answer').columns

df6['Answer'] = df6[cols].apply(lambda s: ', '.join(cols[(m:=[str(s[col]) in str(df6.loc[s.name, 'correct']) for col in cols])]) , axis=1)

But I got inaccurate results:

    answer1    answer2     answer3     correct       Answer
0   Lo         London       CA         [London]      answer1, answer2
1   New York   New          CA         [New York]    answer1, answer2
2   Toronto    Paris        CA         [Toronto]     answer1

It should be:

    answer1    answer2     answer3     correct       Answer
0   Lo         London       CA         [London]      answer2
1   New York   New          CA         [New York]    answer1
2   Toronto    Paris        CA         [Toronto]     answer1

If I change in to ==, the code does not work because the types are not comparable (a str against a list). I also need to wrap the list items in a str to avoid issues with multiple values in my original df.

I do not know how to achieve this.

Answer 1

Strip the list brackets from correct by taking the first element of each list, check which cells of the df contain one of those values, and then collect the names of the matching columns:

df6['answer'] = df6.isin(df6['correct'].str[0].to_list()).agg(lambda s: s.index[s].values, axis=1)
df6

     answer1 answer2 answer3     correct     answer
0        Lo  London      CA    [London]  [answer2]
1  New York     New      CA  [New York]  [answer1]
2   Toronto   Paris      CA   [Toronto]  [answer1]
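
One caveat about the isin call above: it compares every cell against the pooled list of all correct values, not against the value in the same row, so a row could match another row's answer. The following is a minimal sketch of a row-wise variant, not the answerer's method. It assumes each list in correct holds exactly one string, and the names target and mask are introduced here for illustration only.

```python
import pandas as pd

df6 = pd.DataFrame({
    'answer1': ['Lo', 'New York', 'Toronto'],
    'answer2': ['London', 'New', 'Paris'],
    'answer3': ['CA', 'CA', 'CA'],
    'correct': [['London'], ['New York'], ['Toronto']],
})

# Only the answer columns take part in the comparison
cols = df6.filter(like='answer').columns

# Assumption: each list in 'correct' has exactly one element, so take the first
target = df6['correct'].str[0]

# Row-wise exact comparison: each answer cell against its own row's target
mask = df6[cols].eq(target, axis=0)

# Join the names of the matching columns for each row
df6['Answer'] = mask.apply(lambda row: ', '.join(cols[row.to_numpy()]), axis=1)

print(df6['Answer'].tolist())  # ['answer2', 'answer1', 'answer1']
```

Because eq with axis=0 aligns the target Series on the row index, 'Lo' is compared only with 'London' from its own row and is rejected as an exact match.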

Answer 2

I think you should check whether the element in the answer column is in the list in the correct column, not in its string representation:

df6['Answer'] = df6[cols].apply(lambda s: ', '.join(cols[(m:=[str(s[col]) in list(df6.loc[s.name, 'correct']) for col in cols])]) , axis=1)

This should work, because it checks whether the answerX element is in the correct list.
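
To see why the original code over-matched, here is a small standalone sketch of the two kinds of membership test. The variable name correct is used only for illustration and stands for a single cell of the correct column.

```python
correct = ['London']

# str(correct) is the text "['London']", so `in` does a substring search
print('Lo' in str(correct))      # True
print('London' in str(correct))  # True

# Against the list itself, `in` compares each element with ==
print('Lo' in correct)           # False
print('London' in correct)       # True
```

Testing against the list gives the exact-match behaviour of == without comparing a str to a list directly.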
