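What is the regex to match this string?
Consider the following sentences:
apple is 2kg
apple banana mango is 2kg
apple apple apple is 6kg
banana banana banana is 6kg
Given that "apple", "banana", and "mango" are the only fruits, what regex extracts the fruit name(s) that appear at the start of each sentence?
I wrote this regex (https://regex101.com/r/fY8bK1/1):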
^(apple|mango|banana) is (\d+)kg$
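But this only matches when the sentence contains a single fruit.
How can I extract all of the fruit names?
The expected output for the four sentences is:
apple 2
apple banana mango 2
apple apple apple 6
banana banana banana 6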
You can use grouping like this:
^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$
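(?:...) is a non-capturing group nested inside the capturing group (...), so the inner alternations do not clutter the output.
The group ((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) matches:
(?:apple|mango|banana) - any one value from the list of alternatives separated by the alternation operator |. To match whole words only, put \b at both ends of the subpattern.
(?:\s+(?:apple|mango|banana))* - zero or more sequences of:
\s+ - one or more whitespace characters
(?:apple|mango|banana) - any one of the alternatives.
Snippet: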
var re = /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/gm;
var str = 'apple is 2kg\napple banana mango is 2kg\napple apple apple is 6kg\nbanana banana banana is 6kg';
var m;
// Walk through every match in the multi-line string.
while ((m = re.exec(str)) !== null) {
  document.write(m[1] + "," + m[2] + "<br/>");
}
// No whitespace separates the two fruit names here, so the test yields false.
document.write("<b>appleapple is 2kg</b> matched: " + /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/.test("appleapple is 2kg"));
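As a minimal sketch of the \b suggestion above, the pattern can be wrapped in a helper that returns the fruit names and the weight for a single sentence. The name parseFruitLine and the shape of the returned object are assumptions made for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: parse one sentence into its fruit names and weight.
// \b on both sides of each alternation restricts the match to whole words.
const FRUIT_LINE = /^((?:\b(?:apple|mango|banana)\b)(?:\s+\b(?:apple|mango|banana)\b)*) is (\d+)kg$/;

function parseFruitLine(line) {
  const m = FRUIT_LINE.exec(line);
  if (m === null) {
    return null; // the sentence does not have the expected shape
  }
  return {
    fruits: m[1].split(/\s+/), // individual fruit names, in order
    weight: Number(m[2]),      // weight in kg as a number
  };
}

console.log(parseFruitLine('apple banana mango is 2kg'));
console.log(parseFruitLine('appleapple is 2kg'));
```

Because the pattern has no g flag, exec keeps no state between calls, so the helper can be applied to unrelated strings one after another.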
Try this:
var re = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/gm;
re.exec('apple banana mango is 2kg');
// ["apple banana mango is 2kg", "apple banana mango", "2"]
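How is this different from the other answers? The (?= ) ? after the fruit alternation requires a space to be the next character after every fruit name, but only consumes that space when another fruit follows. The captured group therefore has no trailing space, and the space in front of is remains available for the literal " is " to match.
Using it in a while loop lets you collect all results from a multi-line string.
The gm flags here let re.exec apply this RegExp repeatedly to the same String, with ^ and $ matching at every line break. The g flag, however, makes str.match behave differently: it returns only the full matches and omits the capture groups.
If you want to test each string independently, you can keep using re.exec (resetting re.lastIndex to 0 between strings), or remove the flags and use str.match instead: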
var re = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/; // notice flags gone
'apple banana mango is 2kg'.match(re);
// ["apple banana mango is 2kg", "apple banana mango", "2"]
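To illustrate the flag behaviour described above, here is a small sketch that compares str.match with and without the g flag, and uses String.prototype.matchAll (ES2020) as one possible alternative to the while loop. The variable names are illustrative assumptions.

```javascript
const text = 'apple is 2kg\napple banana mango is 2kg';

// With g, str.match returns only the full matches; capture groups are dropped.
const withG = text.match(/^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/gm);

// Without g, str.match returns the first match together with its groups.
const withoutG = text.match(/^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/m);

// matchAll requires the g flag and yields every match with its groups.
const pairs = Array.from(
  text.matchAll(/^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/gm),
  (m) => [m[1], m[2]]
);

console.log(withG);
console.log(withoutG[1], withoutG[2]);
console.log(pairs);
```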
/^(((apple|mango|banana)\s*)+) is (\d+)kg$/$1,$4/gm
Demo: https://regex101.com/r/sA4aW7/2
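So, you start from this, any one of them:
(apple|mango|banana)
Allow optional whitespace after it to separate the repetitions:
(apple|mango|banana)\s*
Then take all the repetitions (at least one):
((apple|mango|banana)\s*)+
One more group has to be added, because you want a single group that captures the whole lot:
(((apple|mango|banana)\s*)+)
With this in place, $1 (the outermost group) contains "banana banana banana ..." and $4 holds the weight. Add ?: yourself to avoid capturing the inner groups, if you prefer.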
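The substitution can be tried in JavaScript with String.prototype.replace. The sketch below assumes a two-line input of my own choosing and shows both the fully capturing pattern and a variant with the inner groups made non-capturing, where the weight moves from $4 to $2.

```javascript
const input = 'apple is 2kg\nbanana banana banana is 6kg';

// All groups capturing: group 1 is the fruit list, group 4 is the weight.
const capturing = input.replace(/^(((apple|mango|banana)\s*)+) is (\d+)kg$/gm, '$1,$4');

// Inner groups made non-capturing: the weight is now group 2.
const nonCapturing = input.replace(/^((?:(?:apple|mango|banana)\s*)+) is (\d+)kg$/gm, '$1,$2');

console.log(capturing);
console.log(capturing === nonCapturing);
```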
^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi
Demo
https://regex101.com/r/dO1rR7/1
Explanation
^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi
^ assert position at start of a line
1st Capturing group ((?:apple|mango|banana| )+)
  (?:apple|mango|banana| )+ Non-capturing group
    Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
    1st Alternative: apple
      apple matches the characters apple literally (case insensitive)
    2nd Alternative: mango
      mango matches the characters mango literally (case insensitive)
    3rd Alternative: banana
      banana matches the characters banana literally (case insensitive)
    4th Alternative: a single space
      matches the space character literally
" is " matches the characters " is " literally, including the surrounding spaces (case insensitive)
2nd Capturing group (\d+)
  \d+ match a digit [0-9]
    Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
kg matches the characters kg literally (case insensitive)
\s? match any white space character [\r\n\t\f ]
  Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
$ assert position at end of a line
g modifier: global. All matches (don't return on first match)
m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
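Because the space is just another alternative in this pattern, group 1 holds the fruit names and the spaces between them as one string, and names written without a separator are accepted too. The following sketch, using made-up input lines and an assumed results array, pulls the individual names back out of group 1.

```javascript
// Pattern from the answer above; the i flag also accepts "Apple" or "KG".
const re = /^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi;
const lines = 'Apple banana is 2kg\nappleapple is 3kg';

const results = [];
for (const m of lines.matchAll(re)) {
  // Extract the individual fruit names from group 1, ignoring the spaces.
  const fruits = m[1].match(/apple|mango|banana/gi) || [];
  results.push({ fruits, weight: Number(m[2]) });
}

console.log(results);
```

The second input line shows the trade-off: "appleapple" is read as two fruits, which the whitespace-separated patterns in the other answers would reject.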