How to match two lists of strings with regex in Python

I have two lists of strings in Python. One of them is a list of desired strings, the other is a larger list of different strings. For example:

desired = ["cat52", "dog64"]
buf = ["horse101", "elephant5", "dog64", "mouse90", "cat52"]

I need a True/False for whether the second list contains all the strings in the first list. So far I did this with:

if all(element in buf for element in desired):
    ...  # every desired string is present in buf

However, now I need the list of desired strings to support regex patterns. For example:

desired = ["cat52", "dog[0-9]+"]

I've looked into the re and regex Python libraries, but I can't figure out a statement that gives me what I want. Any help would be appreciated.

For each regex in desired, you need to test whether any of the strings in buf matches it, and then return True only if all of the regexes find a match:

import re

buf = ["horse101", "elephant5", "dog64", "mouse90", "cat52"]
desired = ["cat52", "dog[0-9]+"]

print(all(any(re.match(d + '$', b) for b in buf) for d in desired))

Output:

True

Note that we add $ to the regex so that (for example) dog[0-9]+ will not match dog4a. Adding ^ to the beginning is not necessary, as re.match anchors matches to the start of the string.
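As a possible variation, not part of the original answer, the same check can be sketched with re.fullmatch, which requires the whole string to match and so avoids appending an anchor by hand. This also sidesteps a subtlety with string concatenation: for a pattern containing alternation, such as cat|dog, an appended anchor binds only to the last alternative. The helper name contains_all is hypothetical and chosen just for illustration.

```python
import re

def contains_all(patterns, strings):
    """Return True if every pattern fully matches at least one string.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Compile each pattern once so it can be reused for every string.
    compiled = [re.compile(p) for p in patterns]
    # fullmatch succeeds only if the entire string matches the pattern.
    return all(any(rx.fullmatch(s) for s in strings) for rx in compiled)

buf = ["horse101", "elephant5", "dog64", "mouse90", "cat52"]

print(contains_all(["cat52", "dog[0-9]+"], buf))                 # True
print(contains_all(["cat52", "dog[0-9]+"], ["cat52", "dog4a"]))  # False
print(contains_all(["cat|dog"], ["catfish"]))                    # False
```

In the last call, re.match("cat|dog" + "$", "catfish") would report a match, because the anchor applies only to the dog alternative; fullmatch rejects it.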

