Exact match of list intersection using re.findall in Python

I would like to get the intersection of two lists of words using regex. The regex engine is implemented in C, which makes it run faster, and that is of huge importance in this particular case. My code almost works, but it also matches embedded words, for example "buy" inside "buyers".

Some code probably explains it better. This is what I have so far:

re.findall(r"(?=(" + '|'.join(['buy', 'sell', 'gilt']) + r"))", ' '.join(['aabuya', 'gilt', 'buyer']))
>> ['buy', 'gilt', 'buy']

While this is what I would like:

re.exactfindall(['buy', 'sell', 'gilt'], ['aabuya', 'gilt', 'buyer'])
>>['gilt']

Thanks.

To do this with regular expressions, the easiest way is probably to add word boundaries (\b) to the pattern, outside the capturing group, giving you:

re.findall(r"\b(?=(" + '|'.join(['buy', 'sell', 'gilt']) + r")\b)",
    ' '.join(['aabuya', 'gilt', 'buyer']))

which outputs ['gilt'] as requested.
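
The same idea can be wrapped in a small helper. This is only a sketch: the name exact_findall is hypothetical (it stands in for the re.exactfindall wished for in the question, which does not exist in the re module), and it assumes a non-empty word list whose words start and end with word characters, so that \b behaves as expected. re.escape is added so that words containing regex metacharacters are matched literally.

```python
import re

def exact_findall(words, candidates):
    """Hypothetical helper: return the whole-word matches of `words` in `candidates`."""
    # Escape each word so characters such as '.' or '+' are matched literally,
    # then wrap the alternation in word boundaries to reject embedded matches.
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")
    # Join the candidates with spaces, as in the original snippet.
    return pattern.findall(" ".join(candidates))

result = exact_findall(['buy', 'sell', 'gilt'], ['aabuya', 'gilt', 'buyer'])
print(result)  # ['gilt']
```

Note that \b also treats punctuation as a boundary, so a candidate such as 'buy-out' would still yield 'buy'. If the candidates themselves may contain such characters, testing each one with pattern.fullmatch is probably safer.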

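# Alternative without regex: keep only the items that are equal to a target word.
listgiven = ['aabuya', 'gilt', 'buyer']
listtomatch = ['buy', 'sell', 'gilt']
# `in` on a list compares whole strings, so embedded words never match.
exactmatch = [x for x in listgiven if x in listtomatch]
print(exactmatch)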
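
Since speed matters in the question, a possible variant of the snippet above converts the words to match into a set first, so each membership test is a hash lookup rather than a scan of the list. This is a sketch under the assumption that the words are plain, hashable strings; the variable names are illustrative and not from the original answer.

```python
list_given = ['aabuya', 'gilt', 'buyer']
list_to_match = ['buy', 'sell', 'gilt']

# Build the set once; lookups are then O(1) on average.
wanted = set(list_to_match)

# Keeps the order (and any duplicates) of list_given.
exact_match = [word for word in list_given if word in wanted]
print(exact_match)  # ['gilt']
```

If order and duplicates do not matter, set(list_given) & set(list_to_match) gives the intersection directly.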

