
RegEx for identifying alphanumeric patterns with special chars and boundaries

I am trying to return one of the following strings (depending on the input):

f23/24  /or/  f23-24   /or/  f23+24

Ideally, it would always return the form f23-24, regardless of the input.

The input strings look like this:

build-f23/24 1st pass demo (50:50)   #Should output f23-24 or f23/24
build-f17-22 1st pass demo (50:50)   #Should output f17-22
build-f-1 +14 1st pass demo (50:50)  #Should output f1-14 or f1+14

Exception:

Some strings will not have a second set of digits:

build-f45 1st pass demo (50:50)      #Should output f45


Where I am so far:

I have the regex below, but it always fails when the separator character is a slash:

regex = r"(\s?)(\-?)(f)(\s?)([\+\-\/]?)(\d\d*)(-?)(\d?\d*)"
tmp = re.search(regex, val)[0]
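
A quick check, offered as a minimal sketch, suggests why the slash case fails: the second separator group `(-?)` accepts only a hyphen, so the match stops right after the first number when the separator is `/` or `+`. The names `attempt`, `samples`, and `matches` are illustrative and do not come from the original post.

```python
import re

# The pattern from the question, unchanged.
attempt = re.compile(r"(\s?)(\-?)(f)(\s?)([\+\-\/]?)(\d\d*)(-?)(\d?\d*)")

samples = [
    "build-f23/24 1st pass demo (50:50)",
    "build-f17-22 1st pass demo (50:50)",
]

# Group 7, (-?), only allows "-" between the two numbers,
# so "/24" is never consumed and the match ends after "23".
matches = [attempt.search(s).group(0) for s in samples]
print(matches)  # ['-f23', '-f17-22']
```

Note that the match also keeps the leading hyphen, because `(\-?)` captures it as part of the overall match.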

For the test data, you can try the following regex: -(f)-?(\d+)(?:\s*([-+/]\d+))?

import re

val = '''
build-f23/24 1st pass demo (50:50)
build-f17-22 1st pass demo (50:50)
build-f-1 +14 1st pass demo (50:50)
build-f45 1st pass demo (50:50)
'''

expected = [['f23-24', 'f23/24'], ['f17-22'], ['f1-14', 'f1+14'], ['f45']]

for m, x in zip(re.findall(r'-(f)-?(\d+)(?:\s*([-+/]\d+))?', val), expected):
  result = ''.join(m)
  print(result in x, ':', result)

Output:

True : f23/24
True : f17-22
True : f1+14
True : f45
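
If the goal is to always get the hyphenated form (f23-24), one possible variation is to move the separator out of the capture group and rebuild the string by hand. This is only a sketch under that assumption; `PATTERN` and `normalize` are hypothetical names, not part of the answer above.

```python
import re

# Same idea as the regex above, but the separator [-+/] sits outside
# the capture group, so only the digits of the second number are captured.
PATTERN = re.compile(r'-(f)-?(\d+)(?:\s*[-+/](\d+))?')

def normalize(line):
    """Return the tag as 'fNN-MM', or 'fNN' when there is no second number."""
    m = PATTERN.search(line)
    if m is None:
        return None  # no tag found in this line
    prefix, first, second = m.groups()
    return f"{prefix}{first}-{second}" if second else f"{prefix}{first}"

lines = [
    "build-f23/24 1st pass demo (50:50)",
    "build-f17-22 1st pass demo (50:50)",
    "build-f-1 +14 1st pass demo (50:50)",
    "build-f45 1st pass demo (50:50)",
]
print([normalize(s) for s in lines])  # ['f23-24', 'f17-22', 'f1-14', 'f45']
```

Returning `None` for lines without a tag is a design choice made here for illustration; raising an exception would work equally well.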

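This is quite a complicated expression, and I'm not sure I understand the full scope, but let's start with an expression that outputs what is desired and then work through the problem step by step: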

.+?(-.+?)([a-z][0-9]+?)?\s|(?:[+][0-9])?([0-9]+)?(.+)

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r".+?(-.+?)([a-z][0-9]+?)?\s|(?:[+][0-9])?([0-9]+)?(.+)"

test_str = ("build-f23/24 1st pass demo (50:50)\n"
    "build-f17-22 1st pass demo (50:50)\n"
    "build-f-1 +14 1st pass demo (50:50)")

subst = "\\1\\2\\3"

# You can manually specify the number of replacements by changing the 4th argument
result = re.sub(regex, subst, test_str, 0, re.MULTILINE)

if result:
    print (result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.


import re

dat = """build-f23/24 1st pass demo (50:50)
      build-f17-22 1st pass demo (50:50)
      build-f-1 +14 1st pass demo (50:50)
      build-f45 1st pass demo (50:50)"""

rgx = r'(?mi)^.*(?<=-)(f)\D?(\d+)(?:\s?([+\/-]\d+))?.*$'
re.sub(rgx,r'\1\2\3',dat).split()
['f23/24', 'f17-22', 'f1+14', 'f45']
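
To make the pieces of this pattern easier to follow, here is the same regex written with `re.VERBOSE` and one comment per component. This is an annotated sketch of my reading of the pattern; `RGX_VERBOSE` is an illustrative name, and the flags are passed as arguments instead of the inline `(?mi)`.

```python
import re

dat = """build-f23/24 1st pass demo (50:50)
      build-f17-22 1st pass demo (50:50)
      build-f-1 +14 1st pass demo (50:50)
      build-f45 1st pass demo (50:50)"""

RGX_VERBOSE = re.compile(r"""
    ^.*             # skip everything before the tag (greedy, backtracks as needed)
    (?<=-)          # the tag must be preceded by a hyphen, as in build-
    (f)             # group 1: the letter f (case-insensitive because of re.I)
    \D?             # optionally skip one non-digit, such as the hyphen in f-1
    (\d+)           # group 2: the first number
    (?:             # optional second part
        \s?         #   at most one whitespace character
        ([+/-]\d+)  #   group 3: the separator plus the second number
    )?
    .*$             # skip the rest of the line
""", re.M | re.I | re.X)

result = RGX_VERBOSE.sub(r'\1\2\3', dat).split()
print(result)  # ['f23/24', 'f17-22', 'f1+14', 'f45']
```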

Or you could do:

rgx1 = r'(?mi)^.*(?<=-)(f)\D?(\d+)(?:\s?[+\/-](\d+))?.*$'
re.sub('(?m)-$','',re.sub(rgx1 ,r'\1\2-\3',dat)).split()
['f23-24', 'f17-22', 'f1-14', 'f45']
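
The outer `re.sub` is needed because the template `\1\2-\3` always emits the hyphen, even when group 3 did not match. The sketch below shows the intermediate result, assuming Python 3.5 or later (where unmatched groups are substituted as empty strings); `intermediate` and `cleaned` are illustrative names.

```python
import re

dat = """build-f23/24 1st pass demo (50:50)
      build-f17-22 1st pass demo (50:50)
      build-f-1 +14 1st pass demo (50:50)
      build-f45 1st pass demo (50:50)"""

rgx1 = r'(?mi)^.*(?<=-)(f)\D?(\d+)(?:\s?[+\/-](\d+))?.*$'

# After the first substitution, the line without a second number
# ends with a dangling hyphen.
intermediate = re.sub(rgx1, r'\1\2-\3', dat).split()
print(intermediate)  # ['f23-24', 'f17-22', 'f1-14', 'f45-']

# Stripping a trailing hyphen from each item gives the final form.
cleaned = [re.sub(r'-$', '', item) for item in intermediate]
print(cleaned)  # ['f23-24', 'f17-22', 'f1-14', 'f45']
```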

Or, instead of calling sub twice, you can build the replacement directly:

re.sub(rgx1,lambda x: f'{x.group(1)}{x.group(2)}-{x.group(3)}' 
                         if x.group(3) else f'{x.group(1)}{x.group(2)}',dat).split()
['f23-24', 'f17-22', 'f1-14', 'f45']

