
Regex Replace All But Pattern

This might be a duplicate, but I'm trying to replace everything except a certain string pattern. Here is a sample of strings:

'dkas;6-17'
'dsajdl 10'
'dsjalkdj16-20'

The goal here is to replace anything that is not of the form numbers-numbers with nothing. So what I'd get from the strings above is:

'6-17'
''
'16-20'

The second string would yield nothing because it doesn't match the pattern. I know the regular expression that matches my pattern, but I'm confused about how I'd use regexp_replace to match everything but that pattern. The following is what I have, but it replaces the pattern I actually want to keep.

re.sub(r'[0-9]{1,2}-[0-9]{1,2}', '', text)

If the second string should yield nothing, you could match one or more characters other than digits or newlines, followed by the pattern captured in a group.

So that sub leaves an empty string when the pattern is absent, you could add an alternation that matches the rest of the line. Replacing with group 1 then keeps only the captured pattern (in Python 3.5 and later, an unmatched group is replaced with an empty string).

[^\d\r\n]+(\d{1,2}-\d{1,2})|.+

In parts:

  • [^\d\r\n]+ Match any char except a digit or a newline, 1 or more times
  • (\d{1,2}-\d{1,2}) Capture group 1: match 1-2 digits, -, and 1-2 digits
  • | Or
  • .+ Match any char except a newline, 1 or more times


Example code

import re

lines = [
    'dkas;6-17',
    'dsajdl 10',
    'dsjalkdj16-20'
]

for text in lines:
    print(re.sub(r'[^\d\r\n]+(\d{1,2}-\d{1,2})|.+', r'\1', text))

Output

6-17

16-20
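
One behaviour worth checking: because [^\d\r\n]+ needs at least one non-digit character before the range, a string that begins with the range falls through to the .+ branch and comes out empty. If that case matters, one possible variant (an assumption, not part of the answer above) is to use * instead of +. The names ORIGINAL, VARIANT and keep_range below are hypothetical, chosen only for this sketch.

```python
import re

# Pattern from the answer above.
ORIGINAL = r'[^\d\r\n]+(\d{1,2}-\d{1,2})|.+'
# Hypothetical variant: `*` lets the range sit at the very start of the string.
VARIANT = r'[^\d\r\n]*(\d{1,2}-\d{1,2})|.+'

def keep_range(pattern, text):
    # Group 1 is empty whenever the `.+` branch matched (Python 3.5+).
    return re.sub(pattern, r'\1', text)

leading = '6-17 abc'
original_result = keep_range(ORIGINAL, leading)
variant_result = keep_range(VARIANT, leading)
sample_results = [keep_range(VARIANT, s)
                  for s in ('dkas;6-17', 'dsajdl 10', 'dsjalkdj16-20')]
print(repr(original_result), repr(variant_result), sample_results)
```

Both patterns treat the three strings from the question the same way; they differ only when the range is the first thing in the string.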

How about just looking for all the matches in the string and concatenating them together?

>>> ''.join(re.findall(r'[0-9]{1,2}-[0-9]{1,2}', 'dkas;6-17abc19-10'))
'6-1719-10'

>>> ''.join(re.findall(r'[0-9]{1,2}-[0-9]{1,2}', 'dsajdl 10'))
''
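
Applied to the three sample strings from the question, the same idea could be wrapped in a small helper. This is only a sketch; the names RANGE and extract_ranges are hypothetical.

```python
import re

# Same pattern as in the question, written as a raw string and compiled once.
RANGE = re.compile(r'[0-9]{1,2}-[0-9]{1,2}')

def extract_ranges(text):
    """Join every digits-digits match found in text; '' if there is none."""
    return ''.join(RANGE.findall(text))

samples = ['dkas;6-17', 'dsajdl 10', 'dsjalkdj16-20']
results = [extract_ranges(s) for s in samples]
print(results)
```

Unlike the re.sub approach, this keeps every match in the string, so whether that is desirable depends on whether a string may contain more than one range.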

Consider matching

\d+-\d+|$


If the string were

dkas;6-17

the first match would be 6-17, and the second would be the empty string at the end of the line.

If the string were

dsjalkdj16-20kl21-33mn

there would be three matches: 16-20, 21-33 and the empty string at the end of the line.

If the string were

dsajdl 10

the first (and only) match would be the empty string at the end of the line.

It follows that if there is only one match, it will be the empty string at the end of the string, and that is what should be returned; otherwise, return the first match, or all but the last, depending on requirements.
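
In Python, this approach might look like the following sketch. The function names first_range and all_ranges are hypothetical, and the code assumes single-line input with no trailing newline (otherwise $ can also match just before that newline).

```python
import re

# A digits-digits run, or the empty string at the end of the input.
PATTERN = re.compile(r'\d+-\d+|$')

def first_range(text):
    # search() always succeeds here, because `$` matches at the end
    # of any string; the leftmost match wins.
    return PATTERN.search(text).group()

def all_ranges(text):
    # The last match is always the empty end-of-string match, so drop it.
    return PATTERN.findall(text)[:-1]

samples = ['dkas;6-17', 'dsajdl 10', 'dsjalkdj16-20kl21-33mn']
firsts = [first_range(s) for s in samples]
alls = [all_ranges(s) for s in samples]
print(firsts)
print(alls)
```

Because the empty alternative guarantees a match, first_range needs no check for None.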
