Exact match of lists intersection using regex.findall in Python
I would like to get the intersection of two lists of words using regex. Its C implementation makes it run faster, which is of huge importance in this particular case. Even though I have code that almost works, it also matches embedded words, for example "buy" inside "buyers".
Some code probably explains it better. This is what I have so far:
re.findall(r"(?=(" + '|'.join(['buy', 'sell', 'gilt']) + r"))", ' '.join(['aabuya', 'gilt', 'buyer']))
>> ['buy', 'gilt', 'buy']
While this is what I would like:
re.exactfindall(['buy', 'sell', 'gilt'], ['aabuya', 'gilt', 'buyer'])
>>['gilt']
Thanks.
To do this using regexps, the easiest way is probably to include word boundaries (\b) in the matching expression, outside the capture group, giving you:
re.findall(r"\b(?=(" + '|'.join(['buy', 'sell', 'gilt']) + r")\b)",
' '.join(['aabuya', 'gilt', 'buyer']))
which outputs ['gilt'], as requested.
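The same idea can be wrapped in a small helper. This is only a sketch: the name exact_findall is hypothetical (it is not part of the re module), and it assumes the words consist of ordinary word characters. It uses re.escape so that words containing regex metacharacters are matched literally, and a non-capturing group so that findall returns the whole match.

```python
import re

def exact_findall(words, candidates):
    """Return the tokens in `candidates` that match one of `words` as a whole word.

    Hypothetical helper, not part of the standard library.
    """
    if not words:
        # An empty alternation would match the empty string at every boundary.
        return []
    # re.escape makes each word match literally, even if it contains
    # characters such as '.' or '+'.
    alternatives = "|".join(re.escape(w) for w in words)
    # \b on both sides rejects matches embedded in longer words.
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return pattern.findall(" ".join(candidates))

print(exact_findall(['buy', 'sell', 'gilt'], ['aabuya', 'gilt', 'buyer']))
```

Note that \b treats any non-word character as a boundary, so a hyphenated candidate such as 'buy-out' would still yield 'buy'. If that matters, splitting on whitespace and comparing whole tokens is safer.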
listgiven=['aabuya', 'gilt', 'buyer']
listtomatch=['buy', 'sell', 'gilt']
exactmatch = [x for x in listgiven if x in listtomatch]
print(exactmatch)
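Since speed matters in the question, a possible variation of the list comprehension above is to convert the words to match into a set first, so that each membership test does not scan the whole list. This is a sketch with renamed variables (list_given, list_to_match, wanted are illustrative names), and whether it beats the regex approach would need to be measured on the real data.

```python
list_given = ['aabuya', 'gilt', 'buyer']
list_to_match = ['buy', 'sell', 'gilt']

# Build the set once; membership tests on a set are O(1) on average,
# whereas `x in some_list` scans the list each time.
wanted = set(list_to_match)

# Keep the order (and any duplicates) of list_given.
exact_match = [word for word in list_given if word in wanted]
print(exact_match)
```

If order and duplicates do not matter, set(list_given) & set(list_to_match) gives the intersection directly.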