Java Regex: getting multiple parts of a string as substrings

Based on a bunch of replies here ("Capturing Part of String" and "Using Java to find substring of a bigger string using Regular Expression"), I'm trying to get multiple parts of a string as substrings:

import java.io.File;
import java.util.regex.*;

String subject = "Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ";

Pattern p = Pattern.compile("^(Re|Fwd?)(.*)(New Mail File Alert: )(MAIL_)((20)\\d\\d)(0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])[_](\\d{10})(\\.)(dat|ctl)(\\s)(XYZ|ABC)$");
Matcher m = p.matcher(subject);

if (m.find()){
   String file = subject.replaceAll("(MAIL_)((20)\\d\\d)(0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])[_](\\d{10})", "$1");
   String path = subject.replaceAll("(XYZ|ABC)$", "$1");
}

The bits I want are "MAIL_20140320_0000000002" and "XYZ"; however, the strings I'm getting back are:

file: Re: New Mail File Alert: MAIL_.dat XYZ
path: Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ

Can anyone see what I'm doing wrong here?

The following pattern works (written as a Java string literal, so the regex \. becomes \\.):

New Mail File Alert: ([^.]*)\\.dat (.*)$

This matches two groups:

 1. MAIL_20140320_0000000002
 2. XYZ
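As a minimal sketch of how that pattern could be applied, the fragment below reads both groups from a Matcher after find(). The class name MailSubjectParser and the helper extract are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the pattern above.
class MailSubjectParser {
    // In a Java string literal the regex \. is written as "\\."
    private static final Pattern SUBJECT =
            Pattern.compile("New Mail File Alert: ([^.]*)\\.dat (.*)$");

    // Returns {file, path}, or null when the subject does not match.
    static String[] extract(String subject) {
        Matcher m = SUBJECT.matcher(subject);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = extract("Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ");
        System.out.println(parts[0]); // MAIL_20140320_0000000002
        System.out.println(parts[1]); // XYZ
    }
}
```

Note that this pattern only handles the .dat extension; the question's pattern also allows .ctl.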
Alternatively, keep replaceAll but let the pattern consume the whole subject, so that only the captured group survives the replacement:

import java.util.regex.*;

String subject = "Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ";

String file = null;
String path = null;
try {
    // The leading ^.*? makes the match span the entire subject; without it,
    // replaceAll would leave the unmatched "Re: New Mail File Alert: " prefix in place.
    Pattern regex = Pattern.compile("^.*?(MAIL[\\d_]+).*?\\s+(.*?)$", Pattern.DOTALL | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    Matcher regexMatcher = regex.matcher(subject);
    try {
        file = regexMatcher.replaceAll("$1"); // MAIL_20140320_0000000002
        path = regexMatcher.replaceAll("$2"); // XYZ
    } catch (IllegalArgumentException ex) {
        // Syntax error in the replacement text (unescaped $ signs?)
    } catch (IndexOutOfBoundsException ex) {
        // Non-existent backreference used in the replacement text
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
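For context on the results reported in the question: replaceAll substitutes only the region that the pattern matched and leaves the rest of the string untouched. The sketch below contrasts a pattern that matches just the suffix with one that matches the whole subject; the class and method names are hypothetical.

```java
// Hypothetical demo class showing what replaceAll leaves behind.
class ReplaceAllDemo {
    // Only the trailing "XYZ" or "ABC" is matched and replaced by itself,
    // so the input comes back unchanged (the "path" result in the question).
    static String suffixOnly(String subject) {
        return subject.replaceAll("(XYZ|ABC)$", "$1");
    }

    // The pattern now matches the entire subject, so replacing the match
    // with group 1 leaves only the captured suffix.
    static String wholeSubject(String subject) {
        return subject.replaceAll("^.*(XYZ|ABC)$", "$1");
    }

    public static void main(String[] args) {
        String subject = "Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ";
        System.out.println(suffixOnly(subject));
        System.out.println(wholeSubject(subject));
    }
}
```

The same reasoning explains the "file" result in the question: the matched text "MAIL_20140320_0000000002" was replaced by group 1, which is just "MAIL_".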

I see no reason to use a regex replacement on the matched String. Just extract the values from the corresponding Matcher groups.

String file = m.group(4) + m.group(5) + m.group(7) + m.group(8)
                + "_" + m.group(9);
String path = m.group(13);

System.out.println(file);
System.out.println(path);

prints

MAIL_20140320_0000000002
XYZ
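Counting groups up to 13 is easy to get wrong when the pattern changes. One possible variation, sketched here as an assumption rather than as part of the original answer, is to make the helper groups non-capturing and name the two groups of interest (named groups are available since Java 7). The names file and path, and the class NamedGroupParser, are chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the question's pattern using named groups.
class NamedGroupParser {
    private static final Pattern SUBJECT = Pattern.compile(
            "^(?:Re|Fwd?).*New Mail File Alert: "
            + "(?<file>MAIL_20\\d\\d(?:0[1-9]|1[012])(?:0[1-9]|[12][0-9]|3[01])_\\d{10})"
            + "\\.(?:dat|ctl)\\s(?<path>XYZ|ABC)$");

    // Returns {file, path}, or null when the subject does not match.
    static String[] extract(String subject) {
        Matcher m = SUBJECT.matcher(subject);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("file"), m.group("path") };
    }

    public static void main(String[] args) {
        String[] parts = extract("Re: New Mail File Alert: MAIL_20140320_0000000002.dat XYZ");
        System.out.println(parts[0]); // MAIL_20140320_0000000002
        System.out.println(parts[1]); // XYZ
    }
}
```

With this layout the file name is captured in one piece, so no string concatenation is needed afterwards.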

