
How to extract a sequence of 7 numbers from a String in Java?

Say I have a String containing "This sentence was written on 2020-03-21 by person 1234567 at 07:23 hours". How would I extract ONLY the "1234567" part of the string? Maybe with a solution from the question "Extract digits from string - StringUtils Java", but I don't know how to limit the extraction to the desired sequence.

If I use str.replaceAll("[^0-9]", "") on this string, I get "2020032112345670723", which means it extracts ALL of the digits in the string, but I want ONLY the sequence containing a certain number of digits (in my case 7).

Also, the sequence will not always be in the same place, so using substring(index from, index to) will not work.

I would probably do that with a regular expression. For seven adjacent digits that would be \d{7}, or even better \b\d{7}\b (thanks @AlexRudenko), which only matches when the seven digits are not part of a longer run of digits or other word characters.

To do so, you can use the Pattern API:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// ...

Pattern digitPattern = Pattern.compile("\\b\\d{7}\\b");
Matcher m = digitPattern.matcher(input); // input is the String to search
while (m.find()) {
    String s = m.group();
    // prints just your 7 digits
    System.out.println(s);
}

I just verified it, and it works fine.

(The pattern extraction is taken from this answer.)
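
To illustrate why the word-boundary version is the safer choice, here is a minimal sketch. The class name BoundaryDemo, the helper findAll, and the sample input are made up for this example and are not part of the answer above. Without \b, \d{7} also matches the first seven digits of a longer number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {

    // Hypothetical helper: collects every match of the given regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up input containing a 9-digit and a 7-digit number.
        String input = "order 123456789 was placed by person 1234567";
        // Without boundaries, the first 7 digits of the 9-digit number also match.
        System.out.println(findAll("\\d{7}", input));        // [1234567, 1234567]
        // With \b on both sides, only the standalone 7-digit run matches.
        System.out.println(findAll("\\b\\d{7}\\b", input));  // [1234567]
    }
}
```

Keep in mind that \b treats letters and underscores as word characters too, so a number glued to a letter (for example "id1234567") would not match the bounded pattern.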

Assuming that the number of digits is not always 7, I would use the regular expression

" ([0-9]+) "

The inner part [0-9]+ finds one or more digits. The spaces to the left and right of it ensure that the number is only found if it is surrounded by spaces, so the dates and times in your input string are ignored. The parentheses are used in combination with group(1) to return only the number without the spaces around it.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main
{

    private static final Pattern regexp=Pattern.compile(" ([0-9]+) ");

    public static void main(String[] args)
    {
        String s="This sentence was written on 2020-03-21 by person 1234567 at 07:23 hours";
        Matcher matcher=regexp.matcher(s);
        if (matcher.find())
        {
            String number=matcher.group(1);
            System.out.printf("number=%s",number);
        }
    }
}

To find only numbers with 5 to 8 digits, you could write " ([0-9]{5,8}) ".

As others wrote in the meantime, \\d may be used as an alternative to [0-9].
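
As a rough sketch of the bounded variant, the snippet below applies " ([0-9]{5,8}) " to the sample sentence and to two other inputs. RangeDemo, firstNumber, and the two extra inputs are hypothetical, chosen only for illustration. Because the pattern needs a space on both sides, a number at the very start or end of the string is not found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeDemo {

    // A run of 5 to 8 digits with a space on each side; group 1 is the digits alone.
    private static final Pattern NUMBER = Pattern.compile(" ([0-9]{5,8}) ");

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstNumber(String input) {
        Matcher matcher = NUMBER.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample sentence from the question: prints 1234567
        System.out.println(firstNumber(
                "This sentence was written on 2020-03-21 by person 1234567 at 07:23 hours"));
        // Only 4 digits, below the lower bound: prints null
        System.out.println(firstNumber("room 1234 is too short"));
        // No trailing space after the number: prints null
        System.out.println(firstNumber("number at the end 1234567"));
    }
}
```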

You can do a simple linear search to find the numeric substring of length 7:

public static void main(String[] args) {
        String str = "This sentence was written on 2020-03-21 by person 1234567 at 07:23 hours";
        System.out.println(getNumber(str));
}
private static String getNumber(String str) {
        String number = null;
        if(str != null)
            for(String s : str.split(" "))
                if(s.length() == 7 && isNumeric(s))
                    number = s;
        return number;
}
private static boolean isNumeric(String str) {
        // Accept ASCII digits only; Integer.parseInt would also accept a signed value such as "-123456"
        for(char c : str.toCharArray())
            if(c < '0' || c > '9')
                return false;
        return !str.isEmpty();
}

Output:

1234567
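
Splitting on spaces misses a number that is directly followed by punctuation, such as "1234567,". As a possible variant, the sketch below scans the characters once and returns the first run of exactly the requested number of digits. DigitRunScanner and firstDigitRun are hypothetical names, and the comma in the sample string is added here for illustration.

```java
public class DigitRunScanner {

    // Hypothetical helper: returns the first run of exactly `length` ASCII digits, or null.
    static String firstDigitRun(String input, int length) {
        if (input == null) {
            return null;
        }
        int i = 0;
        while (i < input.length()) {
            if (!isDigit(input.charAt(i))) {
                i++;
                continue;
            }
            // Consume the whole run of digits starting at this position.
            int start = i;
            while (i < input.length() && isDigit(input.charAt(i))) {
                i++;
            }
            if (i - start == length) {
                return input.substring(start, i);
            }
        }
        return null;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        String str = "This sentence was written on 2020-03-21 by person 1234567, at 07:23 hours";
        System.out.println(firstDigitRun(str, 7)); // prints 1234567
    }
}
```

Unlike the split-based version, this treats any non-digit character as a separator, so a seven-digit piece of a hyphenated value would also be returned.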
