Extract an unknown number of strings from a text using regular expressions
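I am using regular expressions to extract certain information from a text. For example, a name can consist of several given names and one surname (the number is unknown). The following example extracts 2 strings: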
Name:\s+([\w-äöü]+\s[\w-äöü]+)
How do I define a regular expression that extracts an unknown (!) number of strings, up to a defined next term (e.g. "Address:")?
Use
Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)
See proof.
Explanation
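As a quick check, the pattern above can be tried in Python. This is only a sketch: the helper name `extract_name_parts` and the sample text are made up for illustration and are not part of the original answer.

```python
import re

# Pattern copied from the answer above.
NAME_PATTERN = re.compile(r"Name:\s+([\wäöü-]+(?:\s+[\wäöü-]+)*?)(?=\s*Address)")


def extract_name_parts(text):
    """Return the name parts found between 'Name:' and 'Address', or [] if nothing matches."""
    match = NAME_PATTERN.search(text)
    if match is None:
        return []
    # Group 1 holds all name parts as one string; split it on whitespace.
    return match.group(1).split()


# Hypothetical sample input.
sample = "Name: Hans-Jürgen Karl Müller Address: Hauptstraße 1"
print(extract_name_parts(sample))  # ['Hans-Jürgen', 'Karl', 'Müller']
```

The regex itself captures the whole name as a single group; splitting that group afterwards is what yields the individual strings, however many there are. In Python 3, `\w` is Unicode-aware for `str` patterns, so listing `äöü` explicitly is redundant there but harmless.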
--------------------------------------------------------------------------------
  Name:                    'Name:'
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\wäöü-]+                any character of: word characters (a-z,
                             A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ")
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      [\wäöü-]+                any character of: word characters (a-
                               z, A-Z, 0-9, _), 'ä', 'ö', 'ü', '-' (1
                               or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    Address                  'Address'
--------------------------------------------------------------------------------
  )                        end of look-ahead
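The same breakdown can be kept next to the pattern by writing it with Python's `re.VERBOSE` flag, which ignores whitespace in the pattern and allows `#` comments. This is a sketch only: the name `NAME_VERBOSE` and the sample input are assumptions for illustration.

```python
import re

# Same pattern as in the answer, spread over several lines with comments.
NAME_VERBOSE = re.compile(
    r"""
    Name:                # literal 'Name:'
    \s+                  # one or more whitespace characters
    (                    # start of capture group 1
      [\wäöü-]+          #   first name part: word characters, umlauts or '-'
      (?:                #   non-capturing group
        \s+              #     whitespace between two name parts
        [\wäöü-]+        #     next name part
      )*?                #   repeated zero or more times, as few as possible
    )                    # end of capture group 1
    (?=\s*Address)       # lookahead: optional whitespace, then 'Address'
    """,
    re.VERBOSE,
)

# Hypothetical sample input.
match = NAME_VERBOSE.search("Name: Maria Anna Huber  Address: Ringweg 3")
if match:
    print(match.group(1))  # Maria Anna Huber
```

Because the trailing part is a lookahead, "Address" and the whitespace before it are checked but not consumed, so they do not end up in the captured group.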