
What is the regex to match this string?

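Consider the following sentences:

apple is 2kg
apple banana mango is 2kg
apple apple apple is 6kg
banana banana banana is 6kg

Given that "apple", "banana" and "mango" are the only fruits, what regex extracts the fruit name(s) that appear at the start of each sentence?

I wrote this regex (https://regex101.com/r/fY8bK1/1):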

^(apple|mango|banana) is (\d+)kg$  

But this only matches when the sentence contains exactly one fruit.
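The limitation can be reproduced with a short check. This is a minimal sketch that assumes the four sentences are available as plain strings; the names singleFruit, sentences and matched are illustrative only.

```javascript
// The regex from the question: exactly one fruit, then " is ", digits, "kg".
const singleFruit = /^(apple|mango|banana) is (\d+)kg$/;

const sentences = [
  'apple is 2kg',
  'apple banana mango is 2kg',
  'apple apple apple is 6kg',
  'banana banana banana is 6kg',
];

// Keep only the sentences that the pattern accepts.
const matched = sentences.filter((s) => singleFruit.test(s));

console.log(matched); // [ 'apple is 2kg' ]
```

Only the first sentence passes, because the pattern expects " is" immediately after the first fruit name.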

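How do I extract all the fruit names?

The expected output for all 4 sentences should be:

apple 2
apple banana mango 2
apple apple apple 6
banana banana banana 6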

You can use grouping like this:

^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$


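(?:...) is a non-capturing group. It is used inside the capturing group ((...)) so that the inner groups do not clutter the output.

The ((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) group matches:

  • (?:apple|mango|banana) - any one of the values in the alternation list, separated by the | alternation operator. If you want to match whole words only, put \b at both ends of the subpattern.
  • (?:\s+(?:apple|mango|banana))* - 0 or more sequences of:
    • \s+ - 1 or more whitespace characters
    • (?:apple|mango|banana) - any one of the alternatives.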
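The effect of the non-capturing groups can be seen by comparing the length of the match arrays. The sketch below is an illustration only: the capturing pattern is a hypothetical variant written for comparison and does not appear in the answer.

```javascript
const line = 'apple banana mango is 2kg';

// Non-capturing inner groups: only the fruit list and the weight are captured.
const nonCapturing = /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/;

// Same pattern with plain (...) groups: every inner group adds an entry.
const capturing = /^((apple|mango|banana)(\s+(apple|mango|banana))*) is (\d+)kg$/;

const a = line.match(nonCapturing);
const b = line.match(capturing);

console.log(a.length); // 3: full match, fruit list, weight
console.log(b.length); // 6: full match plus five groups
console.log(a[1] + ',' + a[2]); // apple banana mango,2
```

With the capturing variant the weight moves from index 2 to index 5, which is the kind of clutter the (?:...) groups avoid.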

Snippet:

var re = /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/gm;
var str = 'apple is 2kg\napple banana mango is 2kg\napple apple apple is 6kg\nbanana banana banana is 6kg';
var m;

// Print "fruits,weight" for every matching line.
while ((m = re.exec(str)) !== null) {
  document.write(m[1] + "," + m[2] + "<br/>");
}

// Fruit names must be separated by whitespace, so this prints false.
document.write("<b>appleapple is 2kg</b> matched: " + /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/.test("appleapple is 2kg"));
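The snippet relies on document.write, so it needs a browser page. Outside a browser, the same results could be collected with String.prototype.matchAll, as in this sketch; the names fruitLine, text and rows are illustrative, and the ", " separator is an arbitrary choice.

```javascript
const fruitLine = /^((?:apple|mango|banana)(?:\s+(?:apple|mango|banana))*) is (\d+)kg$/gm;

const text = [
  'apple is 2kg',
  'apple banana mango is 2kg',
  'apple apple apple is 6kg',
  'banana banana banana is 6kg',
].join('\n');

// matchAll requires the g flag; here it yields one match object per line.
const rows = Array.from(text.matchAll(fruitLine), (m) => `${m[1]}, ${m[2]}`);

console.log(rows.join('\n'));
```

Each entry of rows combines capture group 1 (the fruit list) with capture group 2 (the weight).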

Try this:

var re = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/gm;

re.exec('apple banana mango is 2kg');
// ["apple banana mango is 2kg", "apple banana mango", "2"]

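How is this different from the other answers? The (?= ) ? after the fruit options forces the next character after each fruit to be a space, but only consumes that space when another fruit follows. Otherwise the space before "is" would be matched twice, once by the group and once by the literal " is".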
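A small comparison shows what the lookahead adds. In this sketch, withoutLookahead is a hypothetical variant written only for contrast; it is not part of the answer.

```javascript
// Variant from the answer: each fruit must be followed by a space.
const withLookahead = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/;

// Hypothetical variant without the lookahead, for comparison only.
const withoutLookahead = /^((?:(?:apple|banana|mango) ?)+) is (\d+)kg$/;

console.log(withLookahead.test('appleapple is 2kg'));    // false
console.log(withoutLookahead.test('appleapple is 2kg')); // true

// The space before "is" is not part of the captured fruit list.
const hit = 'apple banana mango is 2kg'.match(withLookahead);
console.log(JSON.stringify(hit[1])); // "apple banana mango"
```

Without the lookahead, two fruit names written together are accepted, because the optional space can simply match nothing.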


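Use it in a while loop to get all the results from a multi-line string.

The gm flags here let re.exec apply this RegExp to the same String repeatedly: g makes each call resume where the previous match ended, and m makes ^ and $ match at line breaks. Note, however, that the g flag makes str.match behave differently.

If you want to test each string independently, you can keep using re.exec, or drop these flags and use str.match instead: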

var re = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/; // notice flags gone

'apple banana mango is 2kg'.match(re);
// ["apple banana mango is 2kg", "apple banana mango", "2"]
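The difference the g flag makes can be checked directly. This is a sketch over a hypothetical two-line input (twoLines); the other names are illustrative as well.

```javascript
const reGlobal = /^((?:(?:apple|banana|mango)(?= ) ?)+) is (\d+)kg$/gm;
const twoLines = 'apple is 2kg\nbanana banana is 6kg';

// With g, str.match returns only the full matches, without capture groups.
const fullMatches = twoLines.match(reGlobal);
console.log(fullMatches); // [ 'apple is 2kg', 'banana banana is 6kg' ]

// With g, exec records where it stopped in reGlobal.lastIndex.
const first = reGlobal.exec(twoLines);
const second = reGlobal.exec(twoLines);
const third = reGlobal.exec(twoLines); // nothing left: null, lastIndex resets to 0

console.log(first[1]);  // apple
console.log(second[1]); // banana banana
console.log(third);     // null
```

Because lastIndex is kept on the RegExp object, reusing one g-flagged regex across unrelated strings can skip matches; the flagless version avoids that state.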

/^(((apple|mango|banana)\s*)+) is (\d+)kg$/$1,$4/gm

Demo: https://regex101.com/r/sA4aW7/2

So, you start with this, which matches one of the fruits:

(apple|mango|banana)

Allow optional whitespace after it, to separate the repetitions:

(apple|mango|banana)\s*

Then match all the repetitions (at least one):

((apple|mango|banana)\s*)+

An extra group needs to be added, because you want a single group that captures the whole lot:

(((apple|mango|banana)\s*)+)

With that in place, $1 (the outermost group) will contain "banana banana banana ...", and $4 the weight. Add ?: yourself to avoid capturing the inner groups, if you prefer.
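In JavaScript, the substitution above can be sketched with String.prototype.replace, which understands $1 and $4 in the replacement string. The names weighLines, input and output are illustrative.

```javascript
const weighLines = /^(((apple|mango|banana)\s*)+) is (\d+)kg$/gm;

const input = [
  'apple is 2kg',
  'apple banana mango is 2kg',
  'apple apple apple is 6kg',
  'banana banana banana is 6kg',
].join('\n');

// $1 is the outermost group (the fruit list), $4 is the weight.
const output = input.replace(weighLines, '$1,$4');

console.log(output);
```

The trailing space after the last fruit is not part of $1: the final \s* gives it back so that the literal " is" can match.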

/^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi

Demo

https://regex101.com/r/dO1rR7/1


Explanation

/^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi

^ assert position at start of a line
1st Capturing group ((?:apple|mango|banana| )+)
    (?:apple|mango|banana| )+ Non-capturing group
        Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
        1st Alternative: apple
            apple matches the characters apple literally (case sensitive)
        2nd Alternative: mango
            mango matches the characters mango literally (case sensitive)
        3rd Alternative: banana
            banana matches the characters banana literally (case sensitive)
        4th Alternative: " " (a single space)
            " " matches the space character literally
" is" matches the characters " is" literally (case sensitive)
2nd Capturing group (\d+)
    \d+ match a digit [0-9]
        Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
kg matches the characters kg literally (case sensitive)
\s? match any white space character [\r\n\t\f ]
    Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
$ assert position at end of a line
g modifier: global. All matches (don't return on first match)
m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
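The behaviour described in this breakdown can be tried on a small input. This is a sketch over a hypothetical two-line sample; the names loose, sample, pairs and looseOnce are illustrative.

```javascript
const loose = /^((?:apple|mango|banana| )+) is (\d+)kg\s?$/gmi;

const sample = 'Apple Banana is 2KG\nmango is 5kg';

// The i flag lets "Apple" and "KG" match; m lets ^ and $ work per line.
const pairs = Array.from(sample.matchAll(loose), (m) => [m[1], m[2]]);
console.log(pairs); // [ [ 'Apple Banana', '2' ], [ 'mango', '5' ] ]

// A copy without g, so that test() keeps no state between calls.
const looseOnce = new RegExp(loose.source, 'i');

// The space is just another alternative, so it is not required between fruits.
console.log(looseOnce.test('appleapple is 2kg')); // true
```

Treating the space as one of the alternatives keeps the pattern short, at the cost of also accepting fruit names that are run together.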
