
Check if string is in another column pandas

Below is my DataFrame:

import pandas as pd
df = pd.DataFrame({'col1': ['[7]', '[30]', '[0]', '[7]'], 'col2': ['[0%, 7%]', '[30%]', '[30%, 7%]', '[7%]']})

col1    col2    
[7]     [0%, 7%]
[30]    [30%]
[0]     [30%, 7%]
[7]     [7%]

The aim is to check whether the col1 value is contained in col2. Below is what I've tried:

df['test'] = df.apply(lambda x: str(x.col1) in str(x.col2), axis=1)

Below is the expected output:

col1    col2       col3
[7]     [0%, 7%]   True
[30]    [30%]      True
[0]     [30%, 7%]  False
[7]     [7%]       True
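
For reference, a minimal sketch of why the attempt above does not give the expected column: the brackets are part of the string being searched for, so the literal text '[7]' never occurs inside '[0%, 7%]'. The name `naive` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["[7]", "[30]", "[0]", "[7]"],
    "col2": ["[0%, 7%]", "[30%]", "[30%, 7%]", "[7%]"],
})

# Plain substring test: "[7]" (brackets included) is looked up inside "[0%, 7%]".
# In col2 every number is followed by "%", so "7]" never appears and nothing matches.
naive = df.apply(lambda x: str(x.col1) in str(x.col2), axis=1)
print(naive.tolist())  # [False, False, False, False]
```

So the numbers have to be pulled out of the bracket and percent decoration before they can be compared.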

Use Series.str.extractall to get the numbers, reshape with Series.unstack, and then compare with DataFrame.isin and DataFrame.any:

df['test'] = (df['col2'].str.extractall(r'(\d+)')[0].unstack()
                        .isin(df['col1'].str.strip('[]'))
                        .any(axis=1))
print(df)
   col1       col2   test
0   [7]   [0%, 7%]   True
1  [30]      [30%]   True
2   [0]  [30%, 7%]  False
3   [7]       [7%]   True
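
To see what each step of that chain produces, here is a sketch that splits it into separate statements. The intermediate names (`matches`, `wide`, `needle`) are mine and not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["[7]", "[30]", "[0]", "[7]"],
    "col2": ["[0%, 7%]", "[30%]", "[30%, 7%]", "[7%]"],
})

# One row per regex match, indexed by (original row label, match number).
matches = df["col2"].str.extractall(r"(\d+)")[0]

# Move the match number into the columns: one row per original row,
# NaN where a row has fewer numbers than the longest one.
wide = matches.unstack()

# Strip the brackets from col1 so both sides are plain digit strings.
needle = df["col1"].str.strip("[]")

# DataFrame.isin with a Series aligns on the index, so each row of `wide`
# is compared only with that row's value from col1.
df["test"] = wide.isin(needle).any(axis=1)
print(df["test"].tolist())  # [True, True, False, True]
```

Note that a row whose col2 contains no digits at all would be missing from `wide`, so it would end up as NaN in `test` rather than False.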

You can extract the numbers from both columns and join them, then check whether there is at least one match per id using eval + groupby + any:

(df['col2'].str.extractall(r'(?P<col2>\d+)').droplevel(1)
   .join(df['col1'].str[1:-1])
   .eval('col2 == col1')
   .groupby(level=0).any()
)

Output:

0     True
1     True
2    False
3     True
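
A sketch of the same idea with the joined frame kept in a variable, so the long format is visible before it is reduced. The names `joined` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["[7]", "[30]", "[0]", "[7]"],
    "col2": ["[0%, 7%]", "[30%]", "[30%, 7%]", "[7%]"],
})

# Long format: one row per number found in col2, keeping the original
# row label as a (repeated) index so it can be joined back to col1.
joined = (
    df["col2"].str.extractall(r"(?P<col2>\d+)").droplevel(1)
      .join(df["col1"].str[1:-1])  # col1 without its first and last character
)
print(joined)  # 6 rows, columns "col2" and "col1"

# Row-wise equality, then "at least one match" per original row.
result = joined.eval("col2 == col1").groupby(level=0).any()
df["test"] = result
print(df["test"].tolist())  # [True, True, False, True]
```

The slice `str[1:-1]` assumes col1 always has exactly one leading and one trailing bracket.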

One approach:

import ast

# convert to integer list
col2_lst = df["col2"].str.replace("%", "").apply(ast.literal_eval)

# check list containment
df["col3"] = [all(bi in a for bi in b) for a, b in zip(col2_lst, df["col1"].apply(ast.literal_eval))]

print(df)

Output:

   col1       col2   col3
0   [7]   [0%, 7%]   True
1  [30]      [30%]   True
2   [0]  [30%, 7%]  False
3   [7]       [7%]   True

You can also replace the square brackets with word boundaries (\b) and use re.search, like this:

import re
#...
df.apply(lambda x: bool(re.search(x['col1'].replace("[",r"\b").replace("]",r"\b"), x['col2'])), axis=1)
# => 0     True
#    1     True
#    2    False
#    3     True
#    dtype: bool

This works because \b7\b finds a match in [0%, 7%]: the 7 is neither preceded nor followed by a letter, digit or underscore. No match is found in [30%, 7%], because \b0\b does not match a zero that comes right after a digit (here, 3).
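
The word-boundary behaviour can be checked on single strings. This is a small sketch; `contains_number` is a hypothetical helper name, and it assumes col1 holds only digits between the brackets (anything else would need escaping before being used as a pattern).

```python
import re

def contains_number(col1_value: str, col2_value: str) -> bool:
    """Hypothetical helper: turn '[7]' into the pattern r'\\b7\\b' and search col2."""
    pattern = col1_value.replace("[", r"\b").replace("]", r"\b")
    return bool(re.search(pattern, col2_value))

print(contains_number("[7]", "[0%, 7%]"))   # True: 7 sits between a space and '%'
print(contains_number("[0]", "[30%, 7%]"))  # False: the only 0 follows the digit 3
print(contains_number("[7]", "[17%]"))      # False: 7 is part of the longer number 17
```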
