How to replace a string with parts of it between markers (regex)?

I have a text, let's say:

Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit, 
sed do eiusmod [[time (sample)|tempor]]  incididunt ut [[labore]] et dolore magna aliqua.

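I want to replace [[time (sample)|tempor]] with tempor. The structure is always the same: [[string to remove|string to extract]], and it can occur multiple times in the text.

I tried a regex, but could not get it to work without truncating half of the text: re.sub(r'\[.*?\|', '', text)

How can I replace the string?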

You can use the following regex to capture only the relevant field:

r'\[\[[\w\s\(\)]+?\|(.+?)\]\]'

import re

# Match [[left|right]] and capture the part after the pipe as group 1.
regex = r'\[\[[\w\s\(\)]+?\|(.+?)\]\]'

text = '''
Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit,
sed do eiusmod [[time (sample)|tempor]]  incididunt ut [[labore]] et dolore magna aliqua.

Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit, sed do eiusmod [[time (sample)|tempor]]  incididunt ut [[labore]] et dolore magna aliqua.
'''

# Use a raw replacement string so \g<1> is not read as a string escape.
txt = re.sub(regex, r'[[\g<1>]]', text)
print(txt)

Output:

Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit,
sed do eiusmod [[tempor]]  incididunt ut [[labore]] et dolore magna aliqua.

Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit, sed do eiusmod [[tempor]]  incididunt ut [[labore]] et dolore magna aliqua.

Regex101 example here.
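
The answer above keeps the double brackets around the extracted part, while the question asks for the bare word. As a minimal sketch, assuming the same pattern and a shortened single-line sample (the variable names here are only illustrative), the replacement can be changed to the captured group alone:

```python
import re

# Same pattern as in the answer above: group 1 is the text after the pipe.
regex = r'\[\[[\w\s\(\)]+?\|(.+?)\]\]'

# Shortened sample text, used only for illustration.
text = "sed do eiusmod [[time (sample)|tempor]]  incididunt ut [[labore]] et dolore"

# Replace the whole [[left|right]] marker with just the captured right part.
result = re.sub(regex, r'\1', text)
print(result)
```

Markers without a pipe, such as [[labore]], are left untouched because the pattern requires a | between the brackets.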

Use

\[\[(?:(?!\[\[)[^|])*\|(.*?)]]

Replace with [[\1]] or \1, depending on the requirement.

See proof.

Explanation

--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      \[                       '['
--------------------------------------------------------------------------------
      \[                       '['
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [^|]                     any character except: '|'
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  \|                       '|'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  ]]                       ']]'
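
The pattern explained above can be tried in Python roughly as follows. This is a sketch under the assumption that the text sits on one line; the names pattern, bare and linked are illustrative and not part of the original answer.

```python
import re

# The tempered pattern from this answer: after the opening [[, consume
# characters that are not '|' and do not start another '[[', then the pipe,
# then lazily capture everything up to the closing ]].
pattern = re.compile(r'\[\[(?:(?!\[\[)[^|])*\|(.*?)]]')

text = ("Lorem ipsum dolor sit [[amet]], consectetur adipiscing elit, "
        "sed do eiusmod [[time (sample)|tempor]]  incididunt ut [[labore]] "
        "et dolore magna aliqua.")

# Keep only the extracted part.
bare = pattern.sub(r'\1', text)

# Keep the extracted part wrapped in double brackets.
linked = pattern.sub(r'[[\1]]', text)

print(bare)
print(linked)
```

The negative lookahead is what stops a match that begins at [[amet]] from running forward to the pipe inside a later marker, which is the truncation described in the question.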
