Java regex pattern string format

I am exploring regular expressions.

Problem statement: replace the string between # and # with the values provided in the replacements map.

import java.util.regex.*;
import java.util.*;

public class RegExTest {
    public static void main(String[] args) {

        // Maps each name found between two hashes to its replacement value.
        Map<String, String> replacements = new HashMap<>();
        replacements.put("OldString1", "NewString1");
        replacements.put("OldString2", "NewString2");
        replacements.put("OldString3", "NewString3");

        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";

        Pattern pattern = Pattern.compile("\\#(.+?)\\#");
        //Pattern pattern = Pattern.compile("\\#\\#");
        Matcher matcher = pattern.matcher(source);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            // Copy the text before this match and drop the matched "#...#" itself.
            matcher.appendReplacement(buffer, "");
            // Look up the text between the hashes and append its replacement.
            buffer.append(replacements.get(matcher.group(1)));
        }
        // Copy whatever follows the last match.
        matcher.appendTail(buffer);
        System.out.println("OLD_String:"+source);
        System.out.println("NEW_String:"+buffer.toString());

    }
}

Output (this meets my requirement, but I do not understand how the group(1) call works):

OLD_String:#OldString1##OldString2#_ABCDEF_#OldString3#
NEW_String:NewString1NewString2_ABCDEF_NewString3

If I replace the line below

Pattern pattern = Pattern.compile("\\#(.+?)\\#");

with

Pattern pattern = Pattern.compile("\\#\\#");

I get the following error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1

I do not understand the difference between

"\\#(.+?)\\#" and "\\#\\#"

Can you explain the difference?

The difference is fairly straightforward: \\#(.+?)\\# matches two hashes with one or more characters between them, while \\#\\# matches two hashes next to each other.
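To see this on the string from the question, here is a small sketch that lists every full match of each pattern. The class name MatchDemo and the helper findAll are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchDemo {
    // Collects every full match (group 0) of the regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group()); // group() is the same as group(0)
        }
        return matches;
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // prints [#OldString1#, #OldString2#, #OldString3#]
        System.out.println(findAll("\\#(.+?)\\#", source));
        // prints [##]
        System.out.println(findAll("\\#\\#", source));
    }
}
```

The second pattern finds only the adjacent pair of hashes in the middle of the string, which is why it is of no use for the replacement task.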

A more useful question, to my mind, is "what is the difference between \\#(.+?)\\# and \\#.+?\\# ?"

In this case, the difference is what does (or doesn't) get captured. Parentheses in a regex indicate a capture group: basically, a substring you want to retrieve separately from the overall matched string. Here, you're capturing the text between the hashes. The first pattern captures it and makes it available separately, while the second does not. Try it yourself: asking for matcher.group(1) with the first pattern returns that text, while with the second it throws an exception, even though both patterns match the same text.
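A minimal sketch of that experiment, assuming a single input "#OldString1#". The class GroupDemo and its helpers are hypothetical names; groupCount() and group(int) are the standard Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns group(1) of the first match, or a note when the pattern has no group 1.
    static String firstGroupOne(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        try {
            return m.group(1);
        } catch (IndexOutOfBoundsException e) {
            return "no group 1"; // same exception type as in the question
        }
    }

    // Number of capture groups in the pattern; group 0 is not counted.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String input = "#OldString1#";
        System.out.println(firstGroupOne("\\#(.+?)\\#", input)); // prints OldString1
        System.out.println(firstGroupOne("\\#.+?\\#", input));   // prints no group 1
        System.out.println(groupCount("\\#(.+?)\\#"));           // prints 1
        System.out.println(groupCount("\\#.+?\\#"));             // prints 0
    }
}
```

Both patterns match the whole input, so group(0) is "#OldString1#" either way. Only the pattern with parentheses has a group 1 to ask for.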

.+? tells the regex to match one or more of any character lazily, that is, as few characters as possible. So the group stops at the first # that lets the rest of the pattern match, instead of running on to the last # in the string.
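To illustrate the lazy behaviour, the sketch below compares .+? with its greedy counterpart .+ on the string from the question. The greedy variant is not in the original post and is added only for contrast; LazyDemo and firstCapture are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Returns group(1) of the first match, or null when nothing matches.
    static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // Lazy: stops at the first closing hash.
        // prints OldString1
        System.out.println(firstCapture("\\#(.+?)\\#", source));
        // Greedy: runs to the last hash in the string.
        // prints OldString1##OldString2#_ABCDEF_#OldString3
        System.out.println(firstCapture("\\#(.+)\\#", source));
    }
}
```

With the greedy form the loop in the question would find a single match and look up a key that is not in the map, so the lazy quantifier is what keeps each placeholder separate.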

I think \\#\\# matches ##, so the error is because that pattern has no capturing group: there is only group 0 (the whole match) and no group 1. I'm not 100% sure on that part, though.
