Regex to extract the last occurrence of text between two given strings

First of all, my apologies if something similar has already been posted. My regex knowledge is very limited and I was unable to find anything that I could adapt.

Given an XML file that looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <databaseChangeLog>

        <include file="init.changelog.xml"/>
        <include file="v9.1.changelog.xml"/>
        <include file="v9.2.changelog.xml"/>
        <include file="v9.3.changelog.xml"/>
        <include file="v9.3.1.changelog.xml"/>
        <include file="v9.3.3.changelog.xml"/>

    </databaseChangeLog>

I would like a regex that extracts the last version listed in the change log file. In the example above, that would be the string v9.3.3.

The regex needs to be Java-compatible, as I need to use it with Ant.

Thank you in advance. If you are able to help, a few words of explanation about how the regex works would be much appreciated.

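You can read the file into a String and then use the Pattern and Matcher classes. Here is an example:

    // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String target = "...<include file=\"init.changelog.xml\"/><include file=\"v9.1.changelog.xml\"/><include file=\"v9.3.3.changelog.xml\"/></databaseChangeLog>...";
    // "v" followed by dot-separated numbers, or the literal "init"
    Pattern pattern = Pattern.compile("(v)(\\d+(?:\\.\\d+)*)|init");
    Matcher matcher = pattern.matcher(target);
    String version = "";
    while (matcher.find())
    {
        // each match overwrites the previous one, so the last match wins
        version = matcher.group();
        System.out.println(version);
    }
    // version now holds the last match, here "v9.3.3"

The expression (v)(\\d+(?:\\.\\d+)*)|init means: match the letter v, followed by one or more digits (\\d+), followed by zero or more repetitions of a dot and more digits ((?:\\.\\d+)*). Here + means "one or more" and * means "zero or more". The backslashes are doubled because the pattern is written as a Java string literal. Writing the numeric part this way, rather than as (\\d\\.)+, keeps the dot that precedes "changelog" out of the match and allows numbers with more than one digit.

'|' is the OR operator, so the pattern also matches "init".

When part of a pattern is enclosed in parentheses, that part forms a group. Putting the pattern into groups makes it easy to retrieve one piece of the matched string by itself through the matcher. A group that starts with ?: is used only for grouping and is not numbered.

matcher.find() locates the next part of the string that matches the pattern, and matcher.group() returns that matched part. You can also use matcher.group(i) to get a single group from the match.

For example, matcher.group(2) here returns only the numbers and dots, without the letter 'v'. Note that groups are 1-indexed: group 0 is the whole matched part of the target string, so matcher.group(0) works the same as matcher.group(). When the match is "init", groups 1 and 2 did not take part in it and matcher.group(2) returns null.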
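The group numbering described above can be tried out with a small self-contained sketch. The class name LastVersionFinder and the helper lastGroup are hypothetical, and the pattern (v)(\\d+(?:\\.\\d+)*) is an assumption of this sketch, chosen so that a match never ends in a dot; "init" is deliberately left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastVersionFinder {
    // Group 1 is the letter "v"; group 2 is the dot-separated number, e.g. "9.3.3".
    private static final Pattern VERSION = Pattern.compile("(v)(\\d+(?:\\.\\d+)*)");

    /** Returns the requested group of the last match in text, or null if nothing matches. */
    static String lastGroup(String text, int group) {
        Matcher m = VERSION.matcher(text);
        String last = null;
        while (m.find()) {
            last = m.group(group); // later matches overwrite earlier ones
        }
        return last;
    }

    public static void main(String[] args) {
        String xml = "<include file=\"init.changelog.xml\"/>\n"
                + "<include file=\"v9.1.changelog.xml\"/>\n"
                + "<include file=\"v9.3.3.changelog.xml\"/>\n";
        System.out.println(lastGroup(xml, 0)); // whole match
        System.out.println(lastGroup(xml, 2)); // numbers and dots only
    }
}
```

With the sample input in main, group 0 is v9.3.3 and group 2 is 9.3.3. Returning null rather than an empty string makes a file without any version entry easy to detect.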

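Try the following:

    // remove line breaks first, because '.' does not match them by default
    xmlString = xmlString.replace("\r", "").replace("\n", "");
    // the greedy ".*" pushes the capture group to the last "v<number>" in the string
    String version = xmlString.replaceAll("^.*(v\\d+(\\.\\d+)*)[^\\d]+$","$1");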
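One thing to keep in mind with replaceAll is that it returns the input unchanged when the pattern does not match, so a file without any version would come back as the whole XML text. A minimal sketch of the same greedy idea with an explicit match check follows; the class name GreedyVersionExtractor is hypothetical, and using Pattern.DOTALL instead of stripping the line breaks is an assumption of this sketch, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVersionExtractor {
    // DOTALL lets '.' cross line breaks, so the newlines need not be removed first.
    // The greedy ".*" consumes as much as possible, which leaves group 1 on the
    // last "v<number>" token; "\\D+$" requires that no digit follows it.
    private static final Pattern LAST_VERSION =
            Pattern.compile("^.*(v\\d+(?:\\.\\d+)*)\\D+$", Pattern.DOTALL);

    /** Returns the last version token, or null when the text contains none. */
    static String extract(String xml) {
        Matcher m = LAST_VERSION.matcher(xml);
        return m.matches() ? m.group(1) : null;
    }
}
```

Like the original expression, this relies on no digit appearing after the last version token, so a file name such as v9.3.3.changelog2.xml would not be handled.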

Here is a one-liner:

    String lastVersion = input.replaceAll("(?s).*<include file=\"(.*?)\\.changelog\\.xml\"/>\\s*</databaseChangeLog>.*", "$1");
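The idea of anchoring on the closing </databaseChangeLog> tag can also be written with a Matcher, which makes a missing match detectable. The sketch below is only an illustration: the class name LastIncludeFile and both helper methods are hypothetical, and it assumes the last include element is followed by nothing but whitespace before the closing tag and that every file name ends in .changelog.xml.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastIncludeFile {
    // An <include> element followed only by whitespace and the closing tag
    // can only be the last include in the file.
    private static final Pattern LAST_INCLUDE =
            Pattern.compile("<include file=\"([^\"]*)\"/>\\s*</databaseChangeLog>");

    private static final String SUFFIX = ".changelog.xml";

    /** Returns the file attribute of the last include, or null if there is none. */
    static String lastIncludeFile(String xml) {
        Matcher m = LAST_INCLUDE.matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    /** Strips the assumed ".changelog.xml" suffix, e.g. "v9.3.3.changelog.xml" becomes "v9.3.3". */
    static String versionOf(String fileName) {
        return fileName.endsWith(SUFFIX)
                ? fileName.substring(0, fileName.length() - SUFFIX.length())
                : fileName;
    }
}
```

Because [^"]* cannot run past the closing quote of the attribute, no leading .* or (?s) flag is needed here; \s already covers line breaks.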
