
Match similar string in regex

I want to find out whether it is possible to match only one of two or more identical lines.

Strings to be matched:

Its a string
Its a string
Its a string

Expected result:

Its a string

Everything I tried just selects every line, because the lines are exactly identical.

Is it possible to always leave one of the identical lines unmatched?

I'm not 100% sure that this will work for you, but it does what I think you're trying to do.

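import re
string = "Its a string\nIts a string\nIts a string"  # sample input from the question
# Capture one full line, skip anything, then require the same full line again.
p = re.compile(r'(^.+$)((.|\n|\r)*)^\1$', re.MULTILINE)
result = p.search(string)

# search() returns None when no line is repeated.
repeated_line = result.group(1).strip() if result else None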

You need to specify re.MULTILINE so that ^ and $ match at the start and end of every line, not just at the start and end of the whole string.
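
To see the effect of the flag in isolation, here is a minimal sketch. The sample text and variable names are made up for illustration and are not part of the question.

```python
import re

text = "first line\nsecond line"

# Without re.MULTILINE, ^ matches only at the start of the string and $ only
# at the end (or just before a final newline), so ^.+$ cannot match here.
without_flag = re.findall(r'^.+$', text)

# With re.MULTILINE, ^ and $ also match at every line boundary.
with_flag = re.findall(r'^.+$', text, re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['first line', 'second line']
```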

Here's a quick breakdown of the regex:

(^.+$)          # Matches a full line and captures it into '\1'
((.|\n|\r)*)    # Matches any number of characters/newlines
^\1$            # Matches the first capturing group again, ensuring that the second occurrence fills a whole line of its own

There are probably better ways to do this, but this is the first solution I thought of that specifically uses regex.
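
One possible variation, offered as a sketch and not as a definitive answer, is to move the "same line again" check into a lookahead. Each match then covers a single line that has an identical line somewhere after it, so the last copy is never matched, which seems to be what the question asks for. The names `text`, `dup`, `matches` and `deduped` are illustrative assumptions.

```python
import re

text = "Its a string\nIts a string\nIts a string"

# ^(.+)$            a full line, captured as \1
# (?=[\s\S]*^\1$)   lookahead: the same full line occurs again later on
dup = re.compile(r'^(.+)$(?=[\s\S]*^\1$)', re.MULTILINE)

# Every copy except the last one is matched.
matches = [(m.start(), m.group(1)) for m in dup.finditer(text)]
print(matches)  # [(0, 'Its a string'), (13, 'Its a string')]

# Consuming the newline as well lets re.sub drop the earlier copies.
deduped = re.sub(r'^(.+)\n(?=[\s\S]*^\1$)', '', text, flags=re.MULTILINE)
print(deduped)  # Its a string
```

Because the lookahead rescans the rest of the text for every line, this is likely to be slow on large inputs; for those, tracking seen lines in a set would probably be a better fit than a regex.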


 