Regex to capture repeated strings between slashes in a URL

Question

I have the following partial URL, which can look like this:

/it/xyz/test/param+1/param-2/1234/gfd4

Basically: two letters at the beginning, a slash, another unknown string, and then a series of repeatable strings between slashes. I need to capture every string (I know a split on the / delimiter would be fine, but I am interested in how to extract them with a regex). I first came up with this:

^\/([a-zA-Z]{2})\/([a-zA-Z]{1,10})(\/[a-zA-Z1-9\+\-]+)

but it only captures:

group1: it
group2: xyz
group3: /test

and of course it ignores the rest of the string.

If I add a * quantifier at the end, it only captures the last segment:

^\/([a-zA-Z]{2})\/([a-zA-Z]{1,10})(\/[a-zA-Z1-9\+\-]+)*

group1: it
group2: xyz
group3: /gfd4

I am obviously missing some fundamentals, so in addition to the proper regex I would like an explanation.

I tagged this as Java because the engine that parses the regex is the one in JDK 7. As far as I know, each engine may behave differently.

Answer 1

As mentioned here, this is expected:

With one group in the pattern, you can only get one exact result in that group.
If your capture group gets repeated by the pattern (you used the + quantifier on the surrounding non-capturing group), only the last value that matches it gets stored.
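
To see this behaviour in isolation, here is a minimal sketch that runs the question's second pattern (the one ending in *) and reads group 3 afterwards. The class and method names (RepeatedGroupDemo, lastRepeatedGroup) are made up for illustration, and the pattern is written with Java string escaping.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Pattern from the question, with the third group repeated by '*'.
    private static final Pattern REPEATED = Pattern.compile(
            "^/([a-zA-Z]{2})/([a-zA-Z]{1,10})(/[a-zA-Z1-9+\\-]+)*");

    // Returns whatever group 3 holds after matching, or null if the
    // pattern does not match or group 3 never took part in the match.
    public static String lastRepeatedGroup(String url) {
        Matcher m = REPEATED.matcher(url);
        return m.lookingAt() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        // Group 3 matched five times, but only the last iteration is kept.
        System.out.println(lastRepeatedGroup("/it/xyz/test/param+1/param-2/1234/gfd4"));
    }
}
```

The group is overwritten on every iteration of the quantifier, so a single match cannot hand back all the intermediate values through group 3.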

I would rather capture the rest of the string in group 3 with (\\/.*$), as in this demo, then use a split around '/'. Or apply that pattern to the rest of the string:

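import java.util.regex.Matcher;
import java.util.regex.Pattern;

// str holds the rest of the string (group 3 above), e.g. "/test/param+1/param-2/1234/gfd4".
// A forward slash needs no escaping in a Java regex; "\/" is not a valid Java string escape.
Pattern p = Pattern.compile("(/[a-zA-Z1-9+\\-]+)");
Matcher m = p.matcher(str);
while (m.find()) {
    String place = m.group(1); // one "/segment" per find() call
    ...
}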
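
Putting both suggestions together, a self-contained sketch might look like the following. It assumes the URL shape from the question; the class and method names (UrlSegments, bySplit, byFind) are hypothetical, and the segment pattern moves the capturing parentheses so that the leading slash is left out of the group, which differs slightly from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlSegments {
    // Groups 1 and 2 are the fixed leading parts; group 3 is everything after them.
    private static final Pattern PREFIX = Pattern.compile(
            "^/([a-zA-Z]{2})/([a-zA-Z]{1,10})(/.*)$");
    // One "/segment"; group 1 is the segment without its leading slash.
    private static final Pattern SEGMENT = Pattern.compile("/([a-zA-Z1-9+\\-]+)");

    // Option 1: capture the remainder once, then split it on '/'.
    public static List<String> bySplit(String url) {
        Matcher m = PREFIX.matcher(url);
        if (!m.matches()) {
            return new ArrayList<String>();
        }
        // Drop the leading '/' so split does not produce an empty first element.
        return Arrays.asList(m.group(3).substring(1).split("/"));
    }

    // Option 2: capture the remainder once, then call find() repeatedly on it.
    public static List<String> byFind(String url) {
        List<String> segments = new ArrayList<String>();
        Matcher m = PREFIX.matcher(url);
        if (m.matches()) {
            Matcher seg = SEGMENT.matcher(m.group(3));
            while (seg.find()) {
                segments.add(seg.group(1));
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        String url = "/it/xyz/test/param+1/param-2/1234/gfd4";
        System.out.println(bySplit(url));
        System.out.println(byFind(url));
    }
}
```

Each call to find() produces a fresh match, so every segment gets its own group 1 value instead of overwriting the previous one.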

Answer 1
0 2017-10-29 08:46:47
