Java regex pattern matcher
I have a string of the following format:
String name = "A|DescA+B|DescB+C|DescC+...X|DescX+"
So the repeating pattern is ?|?+ , and I don't know how many there will be. The part I want to extract is the part before the |, so for my example I want to extract a list (an ArrayList, for example) that will contain:
[A, B, C, ... X]
I have tried the following pattern:
(.+)\\|.*\\+
but that doesn't work the way I want it to. Any suggestions?
To convert this into a list, you can do it like this:
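import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
Matcher m = Pattern.compile("([^|]+)\\|.*?\\+").matcher(name);
List<String> matches = new ArrayList<String>();
while (m.find()) {
    matches.add(m.group(1)); // group 1 is the part before the |
}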
This gives you the list:
[A, B, C, X]
Note the ? in the middle: it prevents the second part of the regex from consuming the entire string, since it makes the * lazy instead of greedy.
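The difference is easy to see by running both variants side by side. The following is only a minimal sketch: the class name `GreedyVsLazy` and the helper `firstGroups` are made up for illustration, and the sample string leaves out the elided `...` from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: collects group 1 of every match found in the input.
    static List<String> firstGroups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        // Greedy .* runs to the last '+', so one match swallows the whole string.
        System.out.println(firstGroups("([^|]+)\\|.*\\+", name));  // [A]
        // Lazy .*? stops at the first '+', leaving the rest for later matches.
        System.out.println(firstGroups("([^|]+)\\|.*?\\+", name)); // [A, B, C, X]
    }
}
```

With the greedy version the loop runs only once, because the first match already extends to the final + of the input.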
You are consuming any character (.), and that includes the |. So the greedy .+ goes on munching everything, and when the engine then looks for a |, it backtracks only as far as the last one in the string. The group therefore captures almost the whole input instead of just the first part.
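To check what the pattern from the question actually captures, a small sketch like the one below may help. The class name `OriginalPatternDemo` and the method `firstCapture` are illustrative only, and the sample string omits the elided `...`. Because `.+` is greedy, the regex engine backtracks just far enough to find a `|`, so group 1 ends at the last `|` rather than the first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OriginalPatternDemo {
    // Returns group 1 of the first match, or null if nothing matches.
    static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        // Greedy .+ gives characters back only until the last '|' is reachable.
        System.out.println(firstCapture("(.+)\\|.*\\+", name));
        // prints A|DescA+B|DescB+C|DescC+X
        // Excluding '|' from the group forces it to stop at the first '|'.
        System.out.println(firstCapture("([^|]+)\\|.*\\+", name));
        // prints A
    }
}
```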
So, try to match any character but | like this:
"([^|]+)\\|.*\\+"
And if it fits, make sure your all-but-| part is at the beginning of the string using ^ and that there's a + at the end of the string with $:
"^([^|]+)\\|.*\\+$"
UPDATE: Tim Pietzcker makes a good point: since you are already matching until you find a |, you could just as well match the rest of the string and be done with it:
"^([^|]+).*\\+$"
UPDATE2: By the way, if you want to simply get the first part of the string, you can simplify things with:
myString.split("\\|")[0]
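Note that the anchored patterns and the split shortcut each yield only the first key, not the whole list. A quick way to confirm this is sketched below; the class name `FirstKeyOnly` and the helpers `viaRegex` and `viaSplit` are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstKeyOnly {
    // Returns group 1 if the pattern matches the whole input, otherwise null.
    static String viaRegex(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Returns everything before the first '|' (the whole input if there is no '|').
    static String viaSplit(String input) {
        return input.split("\\|")[0];
    }

    public static void main(String[] args) {
        String name = "A|DescA+B|DescB+C|DescC+X|DescX+";
        System.out.println(viaRegex("^([^|]+)\\|.*\\+$", name)); // A
        System.out.println(viaRegex("^([^|]+).*\\+$", name));    // A
        System.out.println(viaSplit(name));                      // A
    }
}
```

To collect every key, one of the looping approaches with `find()` is still needed.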
Another idea: Find all characters between + (or start of string) and |:
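List<String> matchList = new ArrayList<String>();
// Lookbehind: a match must start at the beginning of the string or right after a '+'.
Pattern regex = Pattern.compile("(?<=^|[+])[^|]+");
Matcher regexMatcher = regex.matcher(subjectString); // subjectString is the input string
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}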
I think the simplest solution is to split on \\+ and then apply the pattern (.+?)\\|.* to each part to extract the group you need.
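A rough sketch of that split-then-match idea might look like this. The class name `SplitThenMatch` and the method `extractKeys` are invented for illustration, and skipping parts that contain no | is an assumption rather than something the answer specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitThenMatch {
    private static final Pattern KEY = Pattern.compile("(.+?)\\|.*");

    // Splits the input on '+' and keeps the text before the first '|' of each part.
    static List<String> extractKeys(String input) {
        List<String> keys = new ArrayList<>();
        // String.split drops trailing empty strings, so the final '+' adds no extra part.
        for (String part : input.split("\\+")) {
            Matcher m = KEY.matcher(part);
            if (m.matches()) { // assumption: parts without a '|' are skipped
                keys.add(m.group(1));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(extractKeys("A|DescA+B|DescB+C|DescC+X|DescX+"));
        // prints [A, B, C, X]
    }
}
```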