Extracting part of a URL with a Java regular expression

Question

I'm trying to extract part of a URL from text files.

For example:

/p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted+or+status%3Awont-fix+or+status%3Aclosed" class="search_bin"><span>Closed Tickets</span></a>

I would like to extract only:

 /p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted+or+status%3Awont-fix+or+status%3Aclosed

How can I do that using a regular expression? I tried the regex

  "/p/*./bugs/*."

but it didn't work.

Answer 1

Try this:

   "\/p.*\/bugs[^"]*"

It means: "/p",

then: any characters,

then: "/bugs",

then: any characters except a double quote (").
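
As a minimal sketch, the pattern above could be applied in Java as follows. The class name BugsUrlExtractor and the method extract are hypothetical names chosen for illustration. The sketch assumes the outer quotes in the pattern are string delimiters, and it drops the \/ escapes because a forward slash needs no escaping in a Java regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class BugsUrlExtractor {
    // "/p", any characters, "/bugs", then everything up to the next double quote.
    private static final Pattern BUGS_URL = Pattern.compile("/p.*/bugs[^\"]*");

    // Returns the first match in the line, or an empty Optional if there is none.
    static Optional<String> extract(String line) {
        Matcher m = BUGS_URL.matcher(line);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "/p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted"
                + "+or+status%3Awont-fix+or+status%3Aclosed\" class=\"search_bin\">"
                + "<span>Closed Tickets</span></a>";
        System.out.println(extract(line).orElse("no match"));
    }
}
```

Because .* is greedy, a line containing several such URLs would be matched from the first "/p" to the last "/bugs" segment, so this sketch is only suitable when each line holds a single URL.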

Answer 2

You can use:

(\/p\/.*\/bugs\/.*?(?="))

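Java code:

        // Requires java.util.regex.Pattern and java.util.regex.Matcher.
        // `line` is one line read from the input text file.
        String REGEX = "(\\/p\\/.*\\/bugs\\/.*?(?=\"))";
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(line);
        while (m.find()) {
            // The lookahead (?=") stops the match just before the closing quote.
            String matched = m.group();
            System.out.println("Matched :  " + matched);
        }

Output:

Matched :  /p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted+or+status%3Awont-fix+or+status%3Aclosed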


Answer 3

Here's another way:

(?i)/p/[a-z/]+bugs/[^ "]+

The (?i) at the beginning makes the regex case-insensitive, so you don't have to worry about letter case. After bugs/, the match continues until it reaches either a space or a double quote (").
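
A minimal sketch of how this pattern might be used to collect every match in a block of text. The class name CaseInsensitiveBugsUrls and the method findAll are hypothetical, introduced only for this example. Note that [a-z/]+ accepts only letters and slashes between /p/ and bugs/, so a project name containing digits or hyphens would not match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class CaseInsensitiveBugsUrls {
    // (?i) turns on case-insensitive matching for the whole pattern.
    private static final Pattern BUGS_URL = Pattern.compile("(?i)/p/[a-z/]+bugs/[^ \"]+");

    // Collects every non-overlapping match found in the given text, in order.
    static List<String> findAll(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = BUGS_URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "<a href=\"/P/Foo/Bugs/search/?q=x\" class=\"a\"> and /p/bar/bugs/1 end";
        for (String url : findAll(text)) {
            System.out.println(url);
        }
    }
}
```

Since each match stops at a space or a double quote, several URLs on the same line are returned separately.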
