Regex pattern to match certain URLs

Question

I have a large text and I only want to use certain information from it. The text looks like this:

Some random text here
http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.m3u8
More random text here
http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_1_av.m3u8
More random text here
http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_2_av.m3u8
More random text here
http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_3_av.m3u8

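I only want the http URLs. There are several in the text, but I only need one of them. The regex should be "starts with http and ends with .m3u8".

I looked through a glossary of all the different expressions, but it is confusing to me. I tried "/^(https?:\\/\\/)?([\\da-z\\.-]+)\\.([az\\.]{12,30})([\\/\\w \\.-]*)*\\/?$/" as my pattern. But is that enough?

Any help is appreciated. Thanks.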

Answer 1

Assuming each line shown in your example text is separated by a line separator, the following snippet will work:

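import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The statements below go inside a method, e.g. main.
String text =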
"Some random text here" +
System.getProperty("line.separator") +
"http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.m3u8" +
System.getProperty("line.separator") +
"More random text here" +
System.getProperty("line.separator") +
"http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.m3u8" +
System.getProperty("line.separator") +
// removed some for brevity
"More random text here" +
System.getProperty("line.separator") +
// added counter-example ending with "NOPE"
"http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.NOPE";

// Multi-line pattern:
//                           ┌ line starts with http
//                           |    ┌ any 1+ character reluctantly quantified
//                           |    |  ┌ dot escape
//                           |    |  |  ┌ ending text
//                           |    |  |  |   ┌ end of line marker
//                           |    |  |  |   |
Pattern p = Pattern.compile("^http.+?\\.m3u8$", Pattern.MULTILINE);
Matcher m = p.matcher(text);
while (m.find()) {
    System.out.println(m.group());
}

Output

http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.m3u8
http://xxx-f.xxx.net/i/xx/open/xxxx/1370235-005A/EPISOD-1370235-005A-xxx_,892,144,252,360,540,1584,xxxx,.mp4.csmil/index_0_av.m3u8
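
Since the question asks for only one of the URLs, a minimal sketch could stop at the first match instead of looping. It assumes the same line-separated input and reuses the same pattern; the class name, method name, and example.invalid URLs are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstM3u8Url {
    // Same multi-line pattern as in the snippet above.
    private static final Pattern M3U8_LINE =
            Pattern.compile("^http.+?\\.m3u8$", Pattern.MULTILINE);

    // Returns the first line that starts with "http" and ends with ".m3u8", if any.
    static Optional<String> findFirst(String text) {
        Matcher m = M3U8_LINE.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the real URLs from the question.
        String text = String.join("\n",
                "Some random text here",
                "http://example.invalid/a/index_0_av.m3u8",
                "More random text here",
                "http://example.invalid/a/index_1_av.m3u8");
        System.out.println(findFirst(text).orElse("no match"));
    }
}
```

Using `if`-style single `find()` rather than `while` means the scan ends as soon as one URL is found.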

Edit

For a refined "filter" on the "index_x" file of the URL, you can simply add it to the Pattern between the protocol and the end of the line, for example:

Pattern.compile("^http.+?index_0.+?\\.m3u8$", Pattern.MULTILINE);
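
If the index number varies, the pattern could be built from a parameter. This is only a sketch: the class and method names are hypothetical, and it assumes the index is always followed by "_" as in the sample URLs (index_0_av), which keeps index_1 from also matching index_10.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndexedM3u8Url {
    // Builds the filter pattern for one index number.
    // Assumption: the number is followed by "_" (as in "index_0_av.m3u8").
    static Pattern patternForIndex(int index) {
        return Pattern.compile(
                "^http.+?index_" + index + "_.*?\\.m3u8$", Pattern.MULTILINE);
    }

    // Returns the first URL line whose file name carries the given index, if any.
    static Optional<String> findByIndex(String text, int index) {
        Matcher m = patternForIndex(index).matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}
```

For example, `IndexedM3u8Url.findByIndex(text, 2)` would return the index_2 line of the input, or an empty Optional when no such line exists.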

Answer 2

I haven't tested it, but this should do the trick:

^(http:\/\/.*\.m3u8)
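
In Java, this regex might be used roughly as follows. This is a sketch with made-up class and method names: "/" needs no escaping in a Java regex, the backslash before the dot is doubled inside the string literal, and Pattern.MULTILINE is assumed so that "^" matches at the start of every line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Answer2Demo {
    // The regex above written as a Java string literal.
    private static final Pattern P =
            Pattern.compile("^(http://.*\\.m3u8)", Pattern.MULTILINE);

    // Collects capture group 1 of every match in the text.
    static List<String> findAll(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }
}
```

Because there is no "$" at the end, a line such as http://something.m3u81 still produces a match on its http://something.m3u8 prefix. The pattern also only accepts http, not https.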

Answer 3

This is @capnibishop's answer, with a small change.

^(http://).*(/index_1)[^/]*\.m3u8$

Added the missing "$" sign at the end. This ensures that it matches

http://something.m3u8

and not

http://something.m3u81

Added a condition to match index_1 at the end of the line (in the last path segment), which means it will match:

http://something/index_1_something_else.m3u8

and not

http://something/index_1/something_else.m3u8
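
The cases above can be checked with a small Java sketch. The class and method names are hypothetical, and each candidate is assumed to be tested as a single line, so no MULTILINE flag is used.

```java
import java.util.regex.Pattern;

class Answer3Demo {
    // The pattern from this answer as a Java string literal.
    private static final Pattern P =
            Pattern.compile("^(http://).*(/index_1)[^/]*\\.m3u8$");

    // True when the whole line is an http URL whose last path segment
    // starts with "index_1" and ends with ".m3u8".
    static boolean matches(String line) {
        return P.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "http://something/index_1_something_else.m3u8",
            "http://something/index_1_something_else.m3u81",
            "http://something/index_1/something_else.m3u8",
            "http://something/index_10_something_else.m3u8"
        };
        for (String line : lines) {
            System.out.println(matches(line) + "  " + line);
        }
    }
}
```

One caveat: because [^/]* accepts any characters other than "/", a file named index_10_something_else.m3u8 also matches. Adding a "_" after index_1 would exclude it, assuming the index is always followed by an underscore.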

3 answers

Answer 1
1 2015-04-27 12:30:55

Answer 2
0 2015-04-27 12:24:46

Answer 3
0 2015-04-27 12:41:09
