Regex for numeric portion of Java string

I'm trying to write a Java method that will take a string as a parameter and return another string if it matches a pattern, and null otherwise. The pattern:

  • Starts with a number (1+ digits); then followed by
  • A colon (":"); then followed by
  • A single whitespace (" "); then followed by
  • Any Java string of 1+ characters

Hence, some valid strings that match this pattern:

50: hello
1: d
10938484: 394958558

And some strings that do not match this pattern:

korfed49
: e4949
6
6:
6:sdjjd4

The general skeleton of the method is this:

public String extractNumber(String toMatch) {
    // If toMatch matches the pattern, extract the first number
    // (everything prior to the colon).

    // Else, return null.
}

Here's my best attempt so far, but I know I'm wrong:

public String extractNumber(String toMatch) {
    // If toMatch matches the pattern, extract the first number
    // (everything prior to the colon).
    String regex = "???";
    if(toMatch.matches(regex))
        return toMatch.substring(0, toMatch.indexOf(":"));

    // Else, return null.
    return null;
}

Thanks in advance.

Your description is spot on; now it just needs to be translated to a regex:

^      # Starts
\d+    # with a number (1+ digits); then followed by
:      # A colon (":"); then followed by
       # A single whitespace (" "); then followed by
\w+    # Any word character, one or more times
$      # (followed by the end of input)

Giving, in a Java string:

"^\\d+: \\w+$"

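As a quick check, this pattern can be dropped into the method from the question. The following is only a sketch: the wrapper class name ExtractNumberDemo and the constant REGEX are hypothetical and not part of the original post. It relies on String.matches, which requires the whole input to match, so the ^ and $ anchors are harmless but redundant there.

```java
public class ExtractNumberDemo {
    // Regex from the answer: digits, a colon, one space, then word characters.
    // REGEX is a hypothetical name used only for this sketch.
    private static final String REGEX = "^\\d+: \\w+$";

    // Same approach as the question: test with matches(), then cut at the colon.
    public static String extractNumber(String toMatch) {
        if (toMatch.matches(REGEX)) {
            return toMatch.substring(0, toMatch.indexOf(':'));
        }
        return null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "50: hello", "1: d", "10938484: 394958558",   // valid in the question
            "korfed49", ": e4949", "6", "6:", "6:sdjjd4"  // invalid in the question
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + extractNumber(s));
        }
    }
}
```

Running main prints the leading number for the three valid samples and null for the five invalid ones.
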
You also want to capture the number: put parentheses around \\d+, use a Matcher, and read group 1 if there is a match:

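import java.util.regex.Matcher;
import java.util.regex.Pattern;

// ...

// Compiled once and reused; group 1 captures the leading digits.
private static final Pattern PATTERN = Pattern.compile("^(\\d+): \\w+$");

// ...

public String extractNumber(String toMatch) {
    Matcher m = PATTERN.matcher(toMatch);
    // The ^ and $ anchors make find() succeed only when the whole input fits.
    return m.find() ? m.group(1) : null;
}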
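
The question describes the last part as "any Java string of 1+ characters", whereas \w+ rejects inputs such as "50: hello world". If that broader reading is what is wanted, one possible variant (an assumption on my part, not something the answer above states) swaps \w+ for .+. The class name AnyTailDemo and the constant ANY_TAIL are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnyTailDemo {
    // Hypothetical variant: ".+" accepts one or more characters of any kind,
    // except line terminators (the default behaviour of "." in Java).
    private static final Pattern ANY_TAIL = Pattern.compile("(\\d+): .+");

    public static String extractNumber(String toMatch) {
        Matcher m = ANY_TAIL.matcher(toMatch);
        // matches() requires the entire input to fit the pattern,
        // so no explicit ^ and $ anchors are needed here.
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractNumber("50: hello world")); // accepted by this variant
        System.out.println(extractNumber("6:sdjjd4"));        // still rejected: no space
    }
}
```

Whether punctuation and spaces should be allowed after the number depends on the real input, so treat this as one option rather than a replacement for the pattern above.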

Note: in Java, \\w by default matches only ASCII letters and digits (this is not the case for .NET languages, for instance), and it also matches an underscore. If you don't want the underscore, you can use (Java-specific syntax):

[\w&&[^_]]

instead of \\w for the last part of the regex, giving:

"^(\\d+): [\\w&&[^_]]+$"

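To see what the character class intersection changes, here is a small sketch comparing the two patterns on an input that contains an underscore. The class name UnderscoreDemo, the constant names, and the helper methods are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

public class UnderscoreDemo {
    // Pattern from the answer: \w allows letters, digits and underscore.
    private static final Pattern WITH_UNDERSCORE =
            Pattern.compile("^(\\d+): \\w+$");

    // Intersection variant: word characters minus the underscore.
    private static final Pattern NO_UNDERSCORE =
            Pattern.compile("^(\\d+): [\\w&&[^_]]+$");

    public static boolean acceptsWord(String s) {
        return WITH_UNDERSCORE.matcher(s).matches();
    }

    public static boolean acceptsWordWithoutUnderscore(String s) {
        return NO_UNDERSCORE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"7: camelCase", "7: snake_case"};
        for (String s : samples) {
            System.out.println(s + " | \\w+: " + acceptsWord(s)
                    + " | [\\w&&[^_]]+: " + acceptsWordWithoutUnderscore(s));
        }
    }
}
```

Both patterns accept "7: camelCase"; only the first accepts "7: snake_case".
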
Try using the following: \\d+: \\w+

