Java regex pattern string format
I am exploring regular expressions.
Problem statement: replace each string between # and # with the value provided in the replacements map.
import java.util.regex.*;
import java.util.*;

public class RegExTest {
    public static void main(String[] args) {
        // Maps each placeholder name (the text between the hashes) to its replacement.
        HashMap<String, String> replacements = new HashMap<>();
        replacements.put("OldString1", "NewString1");
        replacements.put("OldString2", "NewString2");
        replacements.put("OldString3", "NewString3");

        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        Pattern pattern = Pattern.compile("\\#(.+?)\\#");
        //Pattern pattern = Pattern.compile("\\#\\#");
        Matcher matcher = pattern.matcher(source);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            // Copy the text before this match and drop the matched "#...#" itself.
            matcher.appendReplacement(buffer, "");
            // group(1) is the text captured by the parentheses in the pattern.
            buffer.append(replacements.get(matcher.group(1)));
        }
        // Copy whatever remains after the last match.
        matcher.appendTail(buffer);

        System.out.println("OLD_String:" + source);
        System.out.println("NEW_String:" + buffer.toString());
    }
}
Output (this meets my requirement, but I do not understand how the group(1) call works):
OLD_String:#OldString1##OldString2#_ABCDEF_#OldString3#
NEW_String:NewString1NewString2_ABCDEF_NewString3
If I replace
Pattern pattern = Pattern.compile("\\#(.+?)\\#");
with
Pattern pattern = Pattern.compile("\\#\\#");
I get the following error:
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1
I do not understand the difference between
"\\#(.+?)\\#" and "\\#\\#"
Can you explain the difference?
The difference is fairly straightforward: "\\#(.+?)\\#" matches two hashes with one or more characters between them, while "\\#\\#" matches two hashes next to each other.
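To see that difference concretely, here is a minimal sketch that lists every match of each pattern in the question's source string. The class name MatchListDemo and the helper allMatches are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class MatchListDemo {
    // Collects every substring of input that the regex matches, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group()); // group() is group(0): the whole match
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // prints [#OldString1#, #OldString2#, #OldString3#]
        System.out.println(allMatches("\\#(.+?)\\#", source));
        // prints [##]
        System.out.println(allMatches("\\#\\#", source));
    }
}
```

The second pattern finds only the two adjacent hashes in the middle of the string, where the first placeholder ends and the second begins.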
A more interesting question, to my mind, is: what is the difference between "\\#(.+?)\\#" and "\\#.+?\\#"?
In that case, what differs is what gets captured. Parentheses in a regex define a capturing group: a substring you want to retrieve separately from the overall match. Here the captured text is whatever sits between the hashes; the first pattern captures it and makes it available separately, while the second does not. Try it yourself: calling matcher.group(1) after a successful match with the first pattern returns that text, while with the second it throws an IndexOutOfBoundsException, even though both patterns match the same text.
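A small sketch of that experiment follows. The class name CaptureGroupDemo and the helper firstGroupOne are invented for this example; the exception is caught only so that both cases can be shown in one run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class CaptureGroupDemo {
    // Returns group(1) of the first match, or a short note when that is not possible.
    static String firstGroupOne(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return "no match";
        }
        try {
            return matcher.group(1);
        } catch (IndexOutOfBoundsException e) {
            // Thrown when the pattern has no capturing group with index 1.
            return "no group 1";
        }
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // With parentheses: prints OldString1
        System.out.println(firstGroupOne("\\#(.+?)\\#", source));
        // Without parentheses: prints no group 1
        System.out.println(firstGroupOne("\\#.+?\\#", source));
    }
}
```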
.+? tells the engine to match one or more of any character lazily, that is, as few characters as possible. The match therefore ends at the first # that follows, rather than running on to the last # in the input.
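The effect of the lazy quantifier is easiest to see next to its greedy counterpart .+ on the same input. This is only an illustrative sketch; LazyVsGreedyDemo and firstMatch are names made up here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class LazyVsGreedyDemo {
    // Returns the first substring matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // Lazy: stops at the first closing hash, prints #OldString1#
        System.out.println(firstMatch("\\#.+?\\#", source));
        // Greedy: runs to the last hash, so it prints the whole source string
        System.out.println(firstMatch("\\#.+\\#", source));
    }
}
```

With the greedy form, the replacement loop in the question would see a single match spanning the entire string, which is why the lazy form is the one that fits here.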
I think "\\#\\#" matches the literal text ##. Because that pattern contains no parentheses, a match has only group 0 (the whole match) and no group 1, so I think that is why group(1) fails. I am not 100% sure on that part, though.
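One way to check this guess is to ask the matcher how many capturing groups the pattern defines. The sketch below uses Matcher.groupCount(), which does not count group 0; the class name GroupCountDemo and the helper groupCount are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class GroupCountDemo {
    // Number of capturing groups in the pattern; group 0 is not included.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(groupCount("\\#(.+?)\\#")); // prints 1
        System.out.println(groupCount("\\#\\#"));      // prints 0
    }
}
```

A count of 0 means that only group(0) is valid, which is consistent with the "No group 1" message in the question.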