
How to exclude "<" in a regex match

I have a String that looks like "<name><address> and <Phone_1>". I need to get a result like this:

1) <name>
2) <address>
3) <Phone_1>

I tried the regex "<(.*)>", but it returns just one result.

The regex you want is:

<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>

This puts the pieces you want into the three capture groups. The full code would then look something like this:

Matcher m = Pattern.compile("<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>").matcher(string);

if (m.find()) {
    String name = m.group(1);
    String address = m.group(2);
    String phone = m.group(3);
}

The pattern .* in a regex is greedy. It matches as many characters as possible between the first < it finds and the last > it can find. For your string, it finds the first <, then consumes as much text as possible up to a >, which it finds at the very end of the string.
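To make the greedy behaviour concrete, here is a minimal sketch that runs the pattern from the question against the sample string and collects every match. The class name GreedyDemo and the helper allMatches are made up for illustration; only java.util.regex is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class GreedyDemo {
    // Returns every full match of the regex in the input, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "<name><address> and <Phone_1>";
        // Greedy .* runs to the last '>' in the string, so there is only one match.
        System.out.println(allMatches("<(.*)>", input));
        // prints [<name><address> and <Phone_1>]
    }
}
```

The single match spans the whole string, which is why the original attempt returned just one result.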

You want a non-greedy, or "lazy", pattern, which matches as few characters as possible: simply <(.+?)>. The question mark after the quantifier is the syntax for non-greedy matching.
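As a rough sketch, the lazy pattern can be compared with the negated character class [^<>] used in the first answer. On the sample string both give the same three names. They differ when a stray < appears before a real tag: the lazy pattern still starts at the first <, while the negated class refuses to cross it. The class name LazyDemo, the helper captures, and the second test string are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class LazyDemo {
    // Returns the contents of capture group 1 for every match.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "<name><address> and <Phone_1>";
        System.out.println(captures("<(.+?)>", input));    // prints [name, address, Phone_1]
        System.out.println(captures("<([^<>]+)>", input)); // prints [name, address, Phone_1]

        // Made-up input with a stray '<' before the tag.
        String stray = "a < b <c>";
        System.out.println(captures("<(.+?)>", stray));    // prints [ b <c]
        System.out.println(captures("<([^<>]+)>", stray)); // prints [c]
    }
}
```

If the text can contain < characters that are not part of a tag, the negated class is the safer choice, since it excludes < from the match as the question title asks.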

This will work if you have a dynamic number of groups.

Pattern p = Pattern.compile("(<\\w+>)");
Matcher m = p.matcher("<name><address> and <Phone_1>");
while (m.find()) {
    System.out.println(m.group());
}
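If the matches are needed as a list rather than printed, one possible variant uses Matcher.results(), which assumes Java 9 or later. The class name TagCollector and the method extract are hypothetical. Note that \w only covers letters, digits and underscore, so a tag such as <first name> would be skipped by this pattern.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answers.
class TagCollector {
    // Same pattern as above, without the redundant outer group.
    private static final Pattern TAG = Pattern.compile("<\\w+>");

    // Collects every <word> tag found in the input, in order of appearance.
    static List<String> extract(String input) {
        return TAG.matcher(input)
                  .results()                 // Stream<MatchResult>, Java 9+
                  .map(MatchResult::group)   // full text of each match
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(extract("<name><address> and <Phone_1>"));
        // prints [<name>, <address>, <Phone_1>]
    }
}
```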
