
Combined regex pattern to match beginning and end of string and remove a separator character

I have the following strings:

"LP, bar, company LLP, foo, LLP"
"LLP, bar, company LLP, foo, LP"
"LLP,bar, company LLP, foo,LP"  # note the absence of a space after/before comma to be removed

I am looking for a regex that takes these inputs and returns the following:

"LP bar, company LLP, foo LLP"
"LLP bar, company LLP, foo LP"
"LLP bar, company LLP, foo LP"

What I have so far is this:

import re

def fix_broken_entity_names(name):
    """
    LLP, NAME -> LLP NAME
    NAME, LP -> NAME LP
    """
    pattern_end = r'^(LL?P),'
    pattern_beg_1 = r', (LL?P)$'
    pattern_beg_2 = r',(LL?P)$'
    combined = r'|'.join((pattern_beg_1, pattern_beg_2, pattern_end))
    return re.sub(combined, r' \1', name)

When I run it:

>>> fix_broken_entity_names("LP, bar, company LLP, foo,LP")
Out[1]: '  bar, company LLP, foo '

I would be very grateful for any hints or solutions :)

You can use

import re
texts = ["LP, bar, company LLP, foo, LLP","LLP, bar, company LLP, foo, LP","LLP,bar, company LLP, foo,LP"]
for text in texts:
    result = ' '.join(re.sub(r"^(LL?P)\s*,|,\s*(LL?P)$", r" \1\2 ", text).split())
    print("'{}' -> '{}'".format(text, result))

Output:

'LP, bar, company LLP, foo, LLP' -> 'LP bar, company LLP, foo LLP'
'LLP, bar, company LLP, foo, LP' -> 'LLP bar, company LLP, foo LP'
'LLP,bar, company LLP, foo,LP' -> 'LLP bar, company LLP, foo LP'

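The regex is ^(LL?P)\s*,|,\s*(LL?P)$:

  • ^(LL?P)\s*, - start of string, LLP or LP (Group 1), zero or more whitespace characters, a comma
  • | - or
  • ,\s*(LL?P)$ - a comma, zero or more whitespace characters, LP or LLP (Group 2), then end of string.

Note that the replacement is the concatenation of the Group 1 and Group 2 values wrapped in single spaces. Only one of the two groups takes part in any given match, so the other contributes an empty string. The post-processing step (split and join) strips all leading/trailing whitespace and collapses runs of whitespace inside the string into single spaces.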
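
To see which group takes part in each match, here is a small sketch. The helper name matched_groups is hypothetical and only for illustration. For every match exactly one of the two groups is set and the other is None; in Python 3.5 and later an unmatched group referenced in the replacement template is substituted as an empty string.

```python
import re

# Same pattern as in the answer: prefix alternative | suffix alternative.
PATTERN = re.compile(r"^(LL?P)\s*,|,\s*(LL?P)$")

def matched_groups(text):
    """Return (group 1, group 2) for every match; the unused group is None."""
    return [m.groups() for m in PATTERN.finditer(text)]

print(matched_groups("LLP,bar, company LLP, foo,LP"))
# [('LLP', None), (None, 'LP')]
```

The "LLP" in the middle of the string is never matched, because it is neither at the start of the string nor directly before the end.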
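
The two steps can also be wrapped in a function with the name used in the question. This is a minimal sketch of the same approach, not a different technique; the intermediate variable padded is only there to make the extra spaces visible before they are collapsed.

```python
import re

PATTERN = re.compile(r"^(LL?P)\s*,|,\s*(LL?P)$")

def fix_broken_entity_names(name):
    """
    LLP, NAME -> LLP NAME
    NAME, LP -> NAME LP
    """
    # Step 1: replace the matched prefix/suffix with the captured value,
    # padded with one space on each side (the unmatched group is empty).
    padded = PATTERN.sub(r" \1\2 ", name)
    # Step 2: split on whitespace and re-join, which strips the ends and
    # collapses any run of whitespace into a single space.
    return " ".join(padded.split())

print(repr(PATTERN.sub(r" \1\2 ", "LLP,bar, company LLP, foo,LP")))
# ' LLP bar, company LLP, foo LP '
print(fix_broken_entity_names("LLP,bar, company LLP, foo,LP"))
# LLP bar, company LLP, foo LP
```

Note that step 2 also normalizes whitespace elsewhere in the name, which may or may not be desirable for your data.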

Use capture groups and reformat as you wish:

Regex:

([^,\r\n]+) *, *([^,\r\n]+) *, *([^,\r\n]+) *, *([^,\r\n]+) *, *([^,\r\n]+)

Replacement:

\1 \2, \3, \4 \5

https://regex101.com/r/jcEzzy/1/
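
In Python, this regex and replacement could be applied with re.sub as sketched below. The function name reformat_five_fields is hypothetical. The pattern assumes exactly five comma-separated fields, as in the sample strings: with fewer fields nothing matches and the string is returned unchanged, and with more fields the result is probably not what you want.

```python
import re

# One capture group per field; " *, *" swallows the comma and any
# spaces around it.
FIELD = r"([^,\r\n]+)"
PATTERN = re.compile(r" *, *".join([FIELD] * 5))

def reformat_five_fields(text):
    """Drop the first and last comma of a five-field string."""
    return PATTERN.sub(r"\1 \2, \3, \4 \5", text)

for text in ["LP, bar, company LLP, foo, LLP",
             "LLP,bar, company LLP, foo,LP"]:
    print(reformat_five_fields(text))
# LP bar, company LLP, foo LLP
# LLP bar, company LLP, foo LP
```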
