
Get substring between two strings using regex

I have XML files containing code like this:

<bean id="ParentDataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource">
    <property name="driverClass" value="${JDBC.MYSQL.DRIVER}" />
    <property name="password" value="${JDBC.MYSQL.PASSWORD}" />
    <property name="user" value="${JDBC.MYSQL.USERNAME}" />
</bean>

I want to get all tokens between value="${ and } using Java code. For the text above, I need the following output:

JDBC.MYSQL.DRIVER 
JDBC.MYSQL.PASSWORD
JDBC.MYSQL.USERNAME

I tried the following code, but I was not able to add the $ symbol to the regex.

BufferedReader reader = new BufferedReader(new FileReader(file));
Pattern pattern = Pattern.compile("value=\"$(.*?)}");
String line;
while((line=reader.readLine())!=null) {
    Matcher matcher = pattern.matcher(line);
    System.out.println(matcher.group(1));
}

Please suggest a solution.

This looks an awful lot like Spring, which has built-in methods to replace ${var_name} with a value taken from a .properties file:

<bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    <property name="location" value="database.properties"/>
</bean>

This snippet will replace each ${var_name} in your XML files with the actual value when the application context is loaded.

Try this:

String input = "abcabc pattern1foopattern2 abcdefg pattern1barpattern2 morestuff";
List<String> strings = Arrays.asList(input.replaceAll("^.*?pattern1", "").split("pattern2.*?(pattern1|$)"));
System.out.println(strings);

I would recommend using an XML parser.

Try this:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("sample.xml"));

        // Collect every <property> element in the document.
        NodeList propertyList = document.getElementsByTagName("property");

        for (int i = 0; i < propertyList.getLength(); i++) {
            NamedNodeMap attributes = propertyList.item(i).getAttributes();
            // Skip <property> elements that have no value attribute.
            Node value = attributes.getNamedItem("value");
            if (value != null) {
                // Prints the raw attribute text, e.g. ${JDBC.MYSQL.DRIVER}
                System.out.println(value.getNodeValue());
            }
        }
    }
}
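
The parser approach prints the raw attribute text, including the surrounding ${ and }. A minimal sketch of stripping that wrapper is shown here, assuming every placeholder fills the whole value attribute. The class name PlaceholderNames and the method fromXml are hypothetical, and the XML is parsed from a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original answer.
class PlaceholderNames {

    // Returns the name inside ${...} for every <property> whose value
    // attribute consists of exactly one placeholder.
    static List<String> fromXml(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = db.parse(new InputSource(new StringReader(xml)));
            NodeList properties = document.getElementsByTagName("property");

            List<String> names = new ArrayList<>();
            for (int i = 0; i < properties.getLength(); i++) {
                // getAttribute returns an empty string when the attribute is absent.
                String value = ((Element) properties.item(i)).getAttribute("value");
                if (value.startsWith("${") && value.endsWith("}")) {
                    names.add(value.substring(2, value.length() - 1));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bean id=\"ParentDataSource\">"
                + "<property name=\"driverClass\" value=\"${JDBC.MYSQL.DRIVER}\"/>"
                + "<property name=\"password\" value=\"${JDBC.MYSQL.PASSWORD}\"/>"
                + "<property name=\"location\" value=\"database.properties\"/>"
                + "</bean>";
        // The plain "database.properties" value is skipped.
        System.out.println(fromXml(xml)); // prints [JDBC.MYSQL.DRIVER, JDBC.MYSQL.PASSWORD]
    }
}
```

Values that mix literal text with a placeholder are ignored by this sketch; a regex applied to the attribute value would be needed for those.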

There are several problems with your regex. First of all, you have missed the { character. Secondly, you need to escape the {, } and $ characters, since they are special characters in a regex. So, this will work:

Pattern pattern = Pattern.compile("value=\"\\$\\{(.*?)\\}");

Also, you will need to call matcher.find() before calling matcher.group(1):

if (matcher.find()) {    
    System.out.println(matcher.group(1));
}
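
Putting the two fixes together with the reading loop from the question might look like the sketch below. The class name ValueTokens and the method read are hypothetical, and a StringReader stands in for the FileReader so the example runs without a file. It collects only the first token on each line, which matches the sample XML.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class ValueTokens {

    // Literal value="${ , then a lazy capture, then a literal }.
    private static final Pattern PATTERN = Pattern.compile("value=\"\\$\\{(.*?)\\}");

    // Reads the source line by line and collects the first token on each line.
    static List<String> read(Reader source) {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher matcher = PATTERN.matcher(line);
                // group(1) is only valid after a successful find().
                if (matcher.find()) {
                    tokens.add(matcher.group(1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String xml = "<bean id=\"ParentDataSource\">\n"
                + "    <property name=\"driverClass\" value=\"${JDBC.MYSQL.DRIVER}\" />\n"
                + "    <property name=\"user\" value=\"${JDBC.MYSQL.USERNAME}\" />\n"
                + "</bean>\n";
        // Prints JDBC.MYSQL.DRIVER and JDBC.MYSQL.USERNAME, one per line.
        read(new StringReader(xml)).forEach(System.out::println);
    }
}
```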

A few issues: you need to escape some of the special characters in the regexp, and most importantly, you need to call a method on the matcher, such as find(), to get it to do the matching. Try this:

    Pattern pattern = Pattern.compile("\\$\\{(.*)\\}");
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        System.out.println(matcher.group(1));
    } else {
        System.out.println("not found");
    }

Note that the regexp requires $, { and } to be escaped with a backslash. And since that backslash has to appear inside a Java string literal, you have to write a double backslash.

This works as long as there is only one occurrence of ${...} on a line. If there are two, the greedy .* makes the group match from the start of the first to the end of the second! For example, if the input is

    abc${FOO}def${BAR}ghi

the group will match FOO}def${BAR, which is probably not what you want. If you need to handle this, you can use something like

    Pattern pattern = Pattern.compile("\\$\\{([^}]*)\\}");

This avoids the spanning problem. However, a single find() matches only the first occurrence. To deal with multiple matches, you have to write an inner loop that starts the next match at the end of the previous one; calling find() repeatedly on the same matcher does exactly that.
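
A minimal sketch of such a loop, using the negated character class pattern from this answer. The class name AllPlaceholders and the method extract are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class AllPlaceholders {

    // [^}]* stops at the first closing brace, so a match cannot span two placeholders.
    private static final Pattern PATTERN = Pattern.compile("\\$\\{([^}]*)\\}");

    // Collects the text inside every ${...} in the input, in order of appearance.
    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        // Each successful find() resumes right after the previous match.
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(extract("abc${FOO}def${BAR}ghi")); // prints [FOO, BAR]
    }
}
```

When reading a file line by line, this while loop would sit inside the loop over lines, which is why the answer calls it an inner loop.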
