Python create list of tuples of strings by splitting from regex pattern

Suppose I have these two strings:

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

I want to use a regex to match the pattern (number-words), then split each string to get a list of tuples:

final = [('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday'),
         ('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

I tried \([0-9]+-(.*?)\) but it did not work.

What am I doing wrong? Any ideas on how to solve this?

Thanks in advance!!

This might nudge you in the right direction:

>>> import re
>>> re.findall(r'^(.*) \((.+?)\)$', s1)
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
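To get the `final` list from the question, the same pattern could be applied to both strings. This is a minimal sketch, assuming every string ends with a parenthesized group preceded by a single space; the names `pattern` and `m` are introduced here for illustration only.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# The greedy (.*) runs up to the last " (" in the string, so earlier
# parentheses such as "(bit-more complicated)" stay inside group 1.
pattern = re.compile(r'^(.*) \((.+?)\)$')

final = []
for s in (s1, s2):
    m = pattern.match(s)
    if m:  # strings that do not end in " (...)" are skipped
        final.append(m.groups())

print(final)
```

Note that this pattern accepts any trailing parenthesized text; it does not check that the content starts with a number followed by a hyphen.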

You can use this regex with findall:

>>> import re
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

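Regex details:

  • ^(.*?) : lazily match 0 or more characters from the start of the string, captured in group #1
  • \s* : match 0 or more whitespace characters
  • \((\d+\s*-\s*\w+[^)]*)\) : match a (<number>-word ..) string and capture the text inside the parentheses in group #2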
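The breakdown above also suggests why the attempt in the question falls short: it has no `\s*` around the hyphen, and the number sits outside its capture group. The sketch below compares the two patterns on the question's strings; the names `attempt` and `fixed` are made up for this illustration.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

attempt = re.compile(r'\([0-9]+-(.*?)\)')                  # pattern from the question
fixed = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')   # pattern from this answer

# The question's pattern captures only the words after the hyphen ...
print(attempt.findall(s1))
# ... and finds nothing in s2, because "67890123 - tomorrow" has spaces
# around the hyphen that [0-9]+- does not allow.
print(attempt.findall(s2))

# The answer's pattern returns both parts of each string as a tuple.
# (.match() is assumed to succeed here; real input may need a None check.)
final = [fixed.match(s).groups() for s in (s1, s2)]
print(final)
```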

Alternatively, you can use this regex with split:

>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']

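Regex details:

  • (?<!\s) : negative lookbehind asserting that the previous character is not whitespace
  • \s* : match 0 or more whitespace characters
  • (?=\((\d+\s*-\s*\w+[^)]*)\)) : lookahead asserting that a (<number>-word ..) string follows. Note that the capture group is there so that the text inside (...) appears in the result of split.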
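To see why the `[:-1]` slice is needed, it may help to look at the full result of `split`. The sketch below assumes the pattern matches exactly once per string, so the result always has three items; `parts`, `head`, `inner`, `rest` and `pair` are illustrative names.

```python
import re

reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# re.split inserts the text of each capture group into its result, so a
# single match yields three items: the text before the match, the captured
# group, and the remainder (which still starts at the opening parenthesis,
# because a lookahead consumes nothing).
parts = reg.split(s2)
print(parts)

# Unpacking assumes exactly one match; the remainder is discarded.
head, inner, rest = parts
pair = (head, inner)
print(pair)
```

Wrapping the sliced list in `tuple(...)`, as in `tuple(reg.split(el)[:-1])`, would give the tuples requested in the question rather than lists.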
