
Regex for extracting the text after a word and before a special character, excluding all numeric parts

I am trying to write a regular expression for the following sample text:

Section 2.1. 1.1.14. Minimum Rent Schedule (subiect to adjustment, if applicable):less than or greater than twelve (12) full calendar months (and such proration or adjustment being based upon the actual number of days in such Lease Year)

Expected output:

Minimum Rent Schedule (subiect to adjustment, if applicable)

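That is, the text between the word 'Section' and the character ':', but, as shown here, without capturing any of the parts that contain numbers.

What I have tried so far is: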

[Section]+.*[:]

Here is one approach.

Example:

import re

s = "Section 2.1. 1.1.14. Minimum Rent Schedule (subiect to adjustment, if applicable):less than or greater than twelve (12) full calendar months (and such proration or adjustment being based upon the actual number of days in such Lease Year)"
print(re.match(r"Section[\d.\s]+(.*?):", s).group(1))

Output:

Minimum Rent Schedule (subiect to adjustment, if applicable)

If the text contains multiple such elements, use re.findall.

Example:

print(re.findall(r"Section[\d.\s]+(.*?):", your_text))
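The line above leaves your_text undefined, so here is a minimal runnable sketch of the same call. The two-line sample in your_text is invented for illustration and is not from the question.

```python
import re

# Hypothetical multi-line input, made up for this sketch.
your_text = (
    "Section 2.1. 1.1.14. Minimum Rent Schedule:details\n"
    "Section 3.2. Security Deposit:more details\n"
)

# With one capture group, findall returns the group's text for every match.
titles = re.findall(r"Section[\d.\s]+(.*?):", your_text)
print(titles)  # ['Minimum Rent Schedule', 'Security Deposit']
```

Note that `.` does not match a newline by default, so each title has to sit on the same line as its closing colon for this to work.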

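The pattern you tried uses a character class. [Section]+ does not match the word Section; it matches one or more of the characters S, e, c, t, i, o, n in any order.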
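A short sketch of what that means in practice. The word "notice" and the shortened sample string are chosen only for this illustration.

```python
import re

s = "Section 2.1. 1.1.14. Minimum Rent Schedule (subiect to adjustment, if applicable):less than"

# The class accepts any mix of its letters, so an unrelated word matches too.
loose = re.match(r"[Section]+", "notice").group(0)
print(loose)  # notice

# On the sample text, the greedy .* runs to the last colon and nothing is
# captured separately, so the numbers and the colon stay in the result.
attempt = re.search(r"[Section]+.*[:]", s).group(0)
print(attempt)
```

The second print shows the whole prefix, from "Section 2.1." through the colon, which is why a capture group and an explicit rule for the numeric parts are needed.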

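To skip everything after Section that contains a digit, repeat the following 0+ times: match a run of non-whitespace characters that contains at least one digit, followed by a space.

Then capture what follows, up to the next colon, in a group.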

Section (?:[^\s\d]*\d\S* )*([^:]+):

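Explanation

  • Section followed by a space matches the literal word and the space after it
  • (?: opens a non-capturing group
    • [^\s\d]* uses a negated character class to match 0+ characters that are neither whitespace nor digits
    • \d\S* then matches one digit, followed by 0+ non-whitespace characters, followed by a space
  • )* closes the group and repeats it 0+ times
  • ([^:]+): captures in group 1 one or more characters other than :, then matches the : itself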
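The breakdown above can also be written into the pattern itself with re.VERBOSE, which ignores unescaped whitespace and allows comments. This is a sketch of the same pattern, not a different one; the literal spaces are written as [ ] because verbose mode would otherwise drop them, and the sample string is shortened.

```python
import re

pattern = re.compile(
    r"""
    Section[ ]        # the literal word "Section" and one space
    (?:               # non-capturing group for one numbering token
        [^\s\d]*      #   0+ chars that are neither whitespace nor digits
        \d            #   one digit, so the token must contain a number
        \S*           #   the rest of the token, up to the next whitespace
        [ ]           #   the space that ends the token
    )*                # repeat the group 0+ times
    ([^:]+)           # group 1: one or more chars other than ":"
    :                 # the closing colon
    """,
    re.VERBOSE,
)

s = "Section 2.1. 1.1.14. Minimum Rent Schedule (subiect to adjustment, if applicable):less than or greater"
title = pattern.match(s).group(1)
print(title)  # Minimum Rent Schedule (subiect to adjustment, if applicable)
```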

Example:

import re

regex = r"Section (?:[^\s\d]*\d\S* )*([^:]+):"
s = "Section 2.1. 1.1.14. Minimum Rent Schedule (subiect to adjustment, if applicable):less than or greater than twelve (12) full calendar months (and such proration or adjustment being based upon the actual number of days in such Lease Year)"
print(re.match(regex, s).group(1))

Result:

Minimum Rent Schedule (subiect to adjustment, if applicable)

To find multiple matches, you can use re.findall:

print(re.findall(regex, s))
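The two patterns on this page give the same result for the sample text, but they can differ when a numbering token contains a letter. The sketch below compares them on a made-up input; the helper extract_title and the test strings are hypothetical. It also guards against re.match returning None, which would otherwise raise an AttributeError on .group(1).

```python
import re

PATTERN_A = r"Section[\d.\s]+(.*?):"                   # digits, dots, whitespace only
PATTERN_B = r"Section (?:[^\s\d]*\d\S* )*([^:]+):"     # any token containing a digit


def extract_title(pattern, text):
    """Return group 1 of a match at the start of text, or None if no match."""
    m = re.match(pattern, text)
    return m.group(1) if m else None


text = "Section 2.1a. Title Here:body"  # hypothetical input with a lettered number

print(extract_title(PATTERN_A, text))  # a. Title Here
print(extract_title(PATTERN_B, text))  # Title Here
print(extract_title(PATTERN_B, "No heading here"))  # None
```

Which behavior is preferable depends on how the section numbers are formatted in the real data.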

