
Check if two strings contain the same pattern in Python

I have the following list:

names = ['s06_215','s06_235b','s06_235','s08_014','18:s08_014','s08_056','s08_169']

s06_235b and s06_235, as well as s08_014 and 18:s08_014, are duplicates. However, as the example shows, the naming follows no specific pattern. I need to do a pairwise comparison of the elements of the list:

for i in range(0, len(names)-1):
    for index, value in enumerate(names):
        print(names[i], names[index])

Then, for each pair, I need to check whether the two names contain the same substring with a length of more than 4. That is, s06_235b and s06_235, and s08_014 and 18:s08_014, would pass this criterion, but s08_056 and s08_169 would not.

How can I achieve this in Python?

You could iterate over all the combinations, join each pair with a special character that cannot occur in those strings, and use a regular expression like (\w{5,}).*#.*\1 to find a repeated group in the joined pair. Unlike simply testing s1 in s2, this also works if only part of the first string is contained in the second, or vice versa.

Here, (\w{5,}) is the shared substring of at least 5 characters (from the \w class in this case, but feel free to adapt), followed by more characters .*, the separator (# in this case), more filler .*, and then another instance of the first group, \1.

import itertools
import re

# a group of 5+ word characters that reappears after the "#" separator
p = re.compile(r"(\w{5,}).*#.*\1")
for pair in itertools.combinations(names, 2):
    m = p.search("#".join(pair))
    if m:
        print("%r shares %r" % (pair, m.group(1)))

Output:

('s06_215', 's06_235b') shares 's06_2'
('s06_215', 's06_235') shares 's06_2'
('s06_235b', 's06_235') shares 's06_235'
('s08_014', '18:s08_014') shares 's08_014'
('s08_014', 's08_056') shares 's08_0'
('18:s08_014', 's08_056') shares 's08_0'
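If the 5-character threshold lets through too many near matches (s06_215 and s06_235b only share the prefix s06_2), the minimum length can be made a parameter. A minimal sketch, assuming a hypothetical helper named shared_pairs and the same # separator; the threshold of 7 is only a guess that happens to fit this sample list:

```python
import itertools
import re

names = ['s06_215', 's06_235b', 's06_235', 's08_014',
         '18:s08_014', 's08_056', 's08_169']

def shared_pairs(names, min_len=5, sep="#"):
    """Hypothetical helper: return (pair, shared_substring) tuples.

    sep is assumed not to occur in any of the names.
    """
    # same regex as above, with the minimum group length filled in
    p = re.compile(r"(\w{%d,}).*%s.*\1" % (min_len, re.escape(sep)))
    result = []
    for pair in itertools.combinations(names, 2):
        m = p.search(sep.join(pair))
        if m:
            result.append((pair, m.group(1)))
    return result

for pair, shared in shared_pairs(names, min_len=7):
    print("%r shares %r" % (pair, shared))
# ('s06_235b', 's06_235') shares 's06_235'
# ('s08_014', '18:s08_014') shares 's08_014'
```

With min_len=5 the helper reports the same six pairs as the loop above.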

Of course, you can tweak the regex to fit your needs. E.g., if you do not want the repeated region to be bounded by _, you could use a regex like p = r"([a-z0-9]\w{3,}[a-z0-9]).*#.*\1".
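To see what that tweak changes, here is a small sketch comparing the two patterns on a made-up pair (the strings ab_cde_x and y_cde_z are invented for illustration and are not from the question). Their only common run of 5 characters is _cde_, which starts and ends with an underscore:

```python
import re

# original pattern: any 5+ word characters (underscore counts as \w)
loose = re.compile(r"(\w{5,}).*#.*\1")
# tweaked pattern: the group must start and end with a lowercase letter or digit
strict = re.compile(r"([a-z0-9]\w{3,}[a-z0-9]).*#.*\1")

# invented example strings, joined with the "#" separator
joined = "#".join(("ab_cde_x", "y_cde_z"))

loose_match = loose.search(joined)
strict_match = strict.search(joined)

print(loose_match.group(1))   # _cde_
print(strict_match)           # None
```

Note that [a-z0-9] excludes uppercase letters, so names with capitals would need a different character class.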

You can use the in operator to see if one string contains another:

if "example" in "this is an example":
    print("found")

Try this:

for i, needle in enumerate(names):
    for j, haystack in enumerate(names):
        # skip comparing a name with itself; the contained name must be longer than 4
        if i != j and len(needle) > 4 and needle in haystack:
            print(needle, haystack)

Edit: As tobias_k mentioned, note that this only works if one entire string is contained in the other.
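If partial overlaps should count too, one regex-free option is to compute the longest common substring of each pair and compare its length to the threshold. This is a different technique from the in test above, sketched here with difflib from the standard library; the function names longest_common_substring and similar_pairs are made up for this example:

```python
from difflib import SequenceMatcher
from itertools import combinations

names = ['s06_215', 's06_235b', 's06_235', 's08_014',
         '18:s08_014', 's08_056', 's08_169']

def longest_common_substring(a, b):
    """Return the longest contiguous block shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]

def similar_pairs(names, min_len=5):
    """Yield (a, b, shared) for pairs sharing at least min_len characters."""
    for a, b in combinations(names, 2):
        shared = longest_common_substring(a, b)
        if len(shared) >= min_len:
            yield a, b, shared

for a, b, shared in similar_pairs(names):
    print(a, b, shared)
```

Unlike the \w-based regex, this counts any character (including :) as part of the shared substring, so adjust it if that matters for your names.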
