
Extract encoded strings based on percentage symbol with Regex and Java

I am trying to detect/match encoded chars starting with %.

My regex is ([%][2-9|A-F][0-9A-F]{1,2})+

On regexr.com it works and matches what I need.

I used these strings for tests: caf%C3%A9+100%+noir%C20 and test%C3%A9+%C3%A0+100%

In my Java code it returns only the first match.

String pattern = "([%][2-9|A-F][0-9A-F]{1,2})+";
Matcher matcher = Pattern.compile(pattern ).matcher(input);
if (matcher.find()) {
  for (int i = 0; i < matcher.groupCount(); i++) {
    System.out.println(matcher.group(i));
  }
}

The output for caf%C3%A9+100%+noir%C20 is %C3%A9 and not %C3%A9 + %C20.

For test%C3%A9+%C3%A0+100% it is %C3%A9 and not %C3%A9 + %C3%A0.

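The regex you are using is more complicated than it needs to be: [%] can be written as a plain %, and the | inside [2-9|A-F] is a literal pipe character, not an alternation. Also, the way you are trying to print all the matches doesn't work: find() is called only once, and the for loop iterates over the capturing groups of that single match, not over successive matches. Try this: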

String input = "caf%C3%A9+100%+noir%C20";
String pattern = "(?:%[2-9A-F][0-9A-F]{1,2})+";
Matcher matcher = Pattern.compile(pattern ).matcher(input);

while (matcher.find()) {
    System.out.println(matcher.group());
}

This prints:

%C3%A9
%C20
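
To try the loop above on both test strings from the question, it can be wrapped in a small self-contained class. This is only a sketch: the class name EncodedRunFinder and the method findAll are hypothetical names introduced for illustration, and the pattern is the one from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and method are illustrative only.
final class EncodedRunFinder {

    // Same pattern as in the answer: one or more consecutive %XX or %XXX tokens.
    private static final Pattern ENCODED_RUN =
            Pattern.compile("(?:%[2-9A-F][0-9A-F]{1,2})+");

    // Returns every run of percent-encoded tokens, in order of appearance.
    static List<String> findAll(String input) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = ENCODED_RUN.matcher(input);
        while (matcher.find()) {        // each call advances to the next match
            runs.add(matcher.group());  // group() is the whole current match
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findAll("caf%C3%A9+100%+noir%C20")); // [%C3%A9, %C20]
        System.out.println(findAll("test%C3%A9+%C3%A0+100%"));  // [%C3%A9, %C3%A0]
    }
}
```

On Java 9 and later, Matcher.results() offers a stream-based way to collect the same matches; the explicit while loop is kept here because it also works on older versions.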

Based on @41686d6564's comment, the solution is to use a while loop and group(0):

String pattern = "([%][2-9A-F][0-9A-F]{1,2})+"; 
Matcher matcher = Pattern.compile(pattern).matcher(input);
while (matcher.find()) {
  System.out.println(matcher.group(0));
}
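
The following sketch illustrates why looping up to groupCount() printed only one entry, and why group(0) is the right call here. The class and method names are hypothetical and exist only for this demonstration; the pattern is the one with the capturing group used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class GroupVersusMatch {

    static final Pattern WITH_GROUP =
            Pattern.compile("([%][2-9A-F][0-9A-F]{1,2})+");

    // Number of capturing groups in the pattern (group 0 is not counted).
    static int capturingGroups(String input) {
        return WITH_GROUP.matcher(input).groupCount();
    }

    // Returns { group(0), group(1) } for the first match, or null if none.
    static String[] firstMatchGroups(String input) {
        Matcher m = WITH_GROUP.matcher(input);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; group(1) keeps only the last
        // repetition captured by the quantified group.
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String input = "caf%C3%A9+100%+noir%C20";
        System.out.println(capturingGroups(input));      // 1
        String[] groups = firstMatchGroups(input);
        System.out.println(groups[0]);                   // %C3%A9
        System.out.println(groups[1]);                   // %A9
    }
}
```

Since groupCount() is 1, a loop with i < groupCount() runs once, for i = 0, and prints only the first whole match. Getting the later matches requires calling find() again, which is what the while loop does.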
