
How to extract 3 numbers from a String

I have an array of strings that contains the following:

"1 years, 2 months, 22 days",
"1 years, 1 months, 14 days",
"4 years, 24 days",
"13 years, 21 days",
"9 months, 1 day";

I need to extract the number of years, months, and days from each item in the list.

What I have tried and failed:

String[] split = duracao.split(",");

if (split.length >= 3) {

    anos = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
    meses = Integer.parseInt(split[1].replaceAll("[^-?0-9]+", ""));
    dias = Integer.parseInt(split[2].replaceAll("[^-?0-9]+", ""));
} else if (split.length >= 2) {

    meses = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
    dias = Integer.parseInt(split[1].replaceAll("[^-?0-9]+", ""));
} else if (split.length >= 1) {

    dias = Integer.parseInt(split[0].replaceAll("[^-?0-9]+", ""));
}

It doesn't work because sometimes the first item in the string is years, and sometimes it's months.

Is it possible to use regex to achieve what I want? To deal with the plurals, I can do:

duration = duration.replace("months", "month");
duration = duration.replace("days", "day");
duration = duration.replace("years", "year");

But now, how do I extract the data I need?

I would simply use a switch block for this.

int years = 0, months = 0, days = 0;

String[] fields = s1.split(", +");
for (String field : fields) {
    String[] parts = field.split(" ");
    int value = Integer.parseInt(parts[0]);

    switch (parts[1]) {
        case "year":
        case "years":
            years = value;
            break;
        case "month":
        case "months":
            months = value;
            break;
        case "day":
        case "days":
            days = value;
            break;
        default:
            throw new IllegalArgumentException("Unknown time unit: " + parts[1]);
    }
}

I would suggest a regex approach that finds each part one by one, as the order may vary:

static void parse(String value) {
    int year = 0, month = 0, day = 0;
    Matcher m;
    if ((m = year_ptn.matcher(value)).find())
        year = Integer.parseInt(m.group(1));
    if ((m = month_ptn.matcher(value)).find())
        month = Integer.parseInt(m.group(1));
    if ((m = day_ptn.matcher(value)).find())
        day = Integer.parseInt(m.group(1));

    System.out.format("y=%2s  m=%2s  d=%2s\n", year, month, day);
}

static Pattern year_ptn = Pattern.compile("(\\d+)\\s+year");
static Pattern month_ptn = Pattern.compile("(\\d+)\\s+month");
static Pattern day_ptn = Pattern.compile("(\\d+)\\s+day");

public static void main(String[] args) {
    List<String> values = Arrays.asList("1 years, 2 months, 22 days", "1 years, 1 months, 14 days",
            "1 months, 1 years, 14 days", "4 years, 24 days", "13 years, 21 days", "9 months, 1 day");

    for (String s : values) {
        parse(s);
    }
}

Output:

y= 1  m= 2  d=22
y= 1  m= 1  d=14
y= 1  m= 1  d=14
y= 4  m= 0  d=24
y=13  m= 0  d=21
y= 0  m= 9  d= 1

I'm not really a Java programmer, but you can just iterate over the string with an index (int iterator = 0; while (iterator != str.length())) and do the following:

  1. Create a new empty string.
  2. If the current char is a digit, add it to the string, advance the iterator, and repeat step 2.
  3. Otherwise (the current char is not a digit):
     3.1. Convert the string you created to a number.
     3.2. Skip the space after the number; if the next char is 'y' the number is the years, 'm' the months, 'd' the days.
  4. Move the iterator to the position of the next digit and go back to step 1.

At the end you should have all the numbers correlated to the years, months, and days.
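
The steps above could be sketched in Java roughly as follows. This is only one possible reading of the description: the class name DurationScanner, the method name scan, and the choice to return an int[] of {years, months, days} are assumptions made for illustration.

```java
import java.util.Arrays;

class DurationScanner {

    // Returns {years, months, days}; units that do not appear stay 0.
    static int[] scan(String text) {
        int[] result = new int[3];
        int i = 0;
        while (i < text.length()) {
            // Move forward until the next digit.
            if (!Character.isDigit(text.charAt(i))) {
                i++;
                continue;
            }
            // Accumulate consecutive digits into one number.
            int value = 0;
            while (i < text.length() && Character.isDigit(text.charAt(i))) {
                value = value * 10 + Character.digit(text.charAt(i), 10);
                i++;
            }
            // Skip the whitespace between the number and its unit.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i == text.length()) {
                break; // a trailing number with no unit is ignored
            }
            // The first letter of the unit decides which slot the number goes in.
            switch (text.charAt(i)) {
                case 'y': result[0] = value; break;
                case 'm': result[1] = value; break;
                case 'd': result[2] = value; break;
                default:
                    throw new IllegalArgumentException("Unknown unit at index " + i + " in: " + text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scan("1 years, 2 months, 22 days"))); // [1, 2, 22]
        System.out.println(Arrays.toString(scan("9 months, 1 day")));            // [0, 9, 1]
    }
}
```

Because only the first letter of each unit is checked, this also accepts abbreviations such as "2 m", which may or may not be what you want.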

Solution using java.time API:

I recommend you use java.time.Period, which is modelled on the ISO-8601 standard and was introduced with Java 8 as part of the JSR-310 implementation.

Demo:

import java.time.Period;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String[] arr = { "1 years, 2 months, 22 days", "1 years, 1 months, 14 days", "4 years, 24 days",
                "13 years, 21 days", "9 months, 1 day" };

        List<Period> periodList = 
                Arrays.stream(arr)
                    .map(s -> Period.parse( 
                                "P"+ s.replaceAll("\\s*years?,?\\s*", "Y")
                                    .replaceAll("\\s*months?,?\\s*", "M")
                                    .replaceAll("\\s*days?,?\\s*", "D")
                            )
                    )
                    .collect(Collectors.toList());
        
        System.out.println(periodList);
        
        // Now you can retrieve years,  months and days from the Period e.g.
        periodList.forEach(p -> 
            System.out.println(
                    p + " => " + 
                    p.getYears() + " years " + 
                    p.getMonths() + " months "+ 
                    p.getDays() +" days"
            )
        );
    }
}

Output:

[P1Y2M22D, P1Y1M14D, P4Y24D, P13Y21D, P9M1D]
P1Y2M22D => 1 years 2 months 22 days
P1Y1M14D => 1 years 1 months 14 days
P4Y24D => 4 years 0 months 24 days
P13Y21D => 13 years 0 months 21 days
P9M1D => 0 years 9 months 1 days

Learn more about the modern Date-Time API* from Trail: Date Time.

Explanation of the regex:

  • \\s* : Zero or more whitespace characters
  • years? : The word year followed by optional s
  • ,?\\s* : An optional comma followed by zero or more whitespace characters
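
To see what the three replacements produce before Period.parse gets involved, it can help to print the intermediate string. The sketch below uses the same three regexes; the class name PeriodTextDemo and the helper toIso are made up for this illustration.

```java
import java.time.Period;

class PeriodTextDemo {

    // Rewrites e.g. "1 years, 2 months, 22 days" into the ISO-8601 form "P1Y2M22D".
    static String toIso(String text) {
        return "P" + text.replaceAll("\\s*years?,?\\s*", "Y")
                         .replaceAll("\\s*months?,?\\s*", "M")
                         .replaceAll("\\s*days?,?\\s*", "D");
    }

    public static void main(String[] args) {
        String iso = toIso("4 years, 24 days");
        System.out.println(iso); // P4Y24D

        Period p = Period.parse(iso);
        // The missing unit comes back as 0.
        System.out.println(p.getYears() + " " + p.getMonths() + " " + p.getDays()); // 4 0 24
    }
}
```

Note that this relies on the units appearing in the order years, months, days. An input such as "1 months, 1 years" would become "P1M1Y", which Period.parse rejects with a DateTimeParseException.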

Solution using Java RegEx API:

Another way of doing it is to use Matcher#find.

Demo:

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String[] arr = { "1 years, 2 months, 22 days", "1 years, 1 months, 14 days", "4 years, 24 days",
                "13 years, 21 days", "9 months, 1 day" };

        int[] years = new int[arr.length];
        int[] months = new int[arr.length];
        int[] days = new int[arr.length];

        Pattern yearPattern = Pattern.compile("\\d+(?= year(?:s)?)");
        Pattern monthPattern = Pattern.compile("\\d+(?= month(?:s)?)");
        Pattern dayPattern = Pattern.compile("\\d+(?= day(?:s)?)");

        for (int i = 0; i < arr.length; i++) {
            Matcher yearMatcher = yearPattern.matcher(arr[i]);
            Matcher monthMatcher = monthPattern.matcher(arr[i]);
            Matcher dayMatcher = dayPattern.matcher(arr[i]);

            years[i] = yearMatcher.find() ? Integer.parseInt(yearMatcher.group()) : 0;
            months[i] = monthMatcher.find() ? Integer.parseInt(monthMatcher.group()) : 0;
            days[i] = dayMatcher.find() ? Integer.parseInt(dayMatcher.group()) : 0;
        }

        // Display
        System.out.println(Arrays.toString(years));
        System.out.println(Arrays.toString(months));
        System.out.println(Arrays.toString(days));
    }
}

Output:

[1, 1, 4, 13, 0]
[2, 1, 0, 0, 9]
[22, 14, 24, 21, 1]

Explanation of the regex:

  • \\d+ : One or more digits
  • (?= : Start of lookahead assertion pattern
    • " year" : A space character followed by year
    • (?:s)? : Optional character, s
  • ) : End of lookahead assertion pattern
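
A small sketch of why the lookahead is convenient here: the unit is checked but not consumed, so group() holds only the digits and can go straight into Integer.parseInt. The class name LookaheadDemo and the helper amountOf are assumptions for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {

    // Returns the number in front of the given unit ("year", "month" or "day"), or 0 if absent.
    static int amountOf(String unit, String text) {
        Matcher m = Pattern.compile("\\d+(?= " + unit + "s?)").matcher(text);
        return m.find() ? Integer.parseInt(m.group()) : 0;
    }

    public static void main(String[] args) {
        String text = "9 months, 1 day";
        Matcher m = Pattern.compile("\\d+(?= months?)").matcher(text);
        if (m.find()) {
            // The lookahead verified " months" but the match itself is just the digit.
            System.out.println("[" + m.group() + "] ends at index " + m.end()); // [9] ends at index 1
        }
        System.out.println(amountOf("year", text)); // 0
        System.out.println(amountOf("day", text));  // 1
    }
}
```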

Check this regex demo to understand the regex more closely.


* If you are working on an Android project and your Android API level is not yet compatible with Java 8, check Java 8+ APIs available through desugaring. Note that Android 8.0 Oreo already provides support for java.time.

I would use one regular expression, if the fields must be given in the sequence years, months, days:

// each unit group ends in "?" so that a missing unit leaves its capture group null
var pattern = Pattern.compile("(?:(\\d+) years?)?,? ?(?:(\\d+) months?)?,? ?(?:(\\d+) days?)?");
var matcher = pattern.matcher(duracao);
if (matcher.matches()) {
    var anos = Integer.parseInt(Objects.requireNonNullElse(matcher.group(1), "0"));
    var meses = Integer.parseInt(Objects.requireNonNullElse(matcher.group(2), "0"));
    var dias = Integer.parseInt(Objects.requireNonNullElse(matcher.group(3), "0"));
    ...
} else {
    // error message
}
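
For a quick check of the single-pattern idea, here is a self-contained sketch. It assumes every unit group is marked optional with a trailing ?, so a missing unit leaves its capture group null; the class name SinglePatternDemo and the int[] return type are choices made for this example.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePatternDemo {

    // Assumption: every unit is optional, but the order years, months, days is fixed.
    static final Pattern DURATION = Pattern.compile(
            "(?:(\\d+) years?)?,? ?(?:(\\d+) months?)?,? ?(?:(\\d+) days?)?");

    // Returns {years, months, days}, or throws if the text does not fit the pattern.
    static int[] parse(String text) {
        Matcher matcher = DURATION.matcher(text);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unrecognized duration: " + text);
        }
        int[] result = new int[3];
        for (int g = 1; g <= 3; g++) {
            // group(g) is null when that unit is missing from the text
            result[g - 1] = Integer.parseInt(Objects.requireNonNullElse(matcher.group(g), "0"));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("1 years, 2 months, 22 days"))); // [1, 2, 22]
        System.out.println(Arrays.toString(parse("4 years, 24 days")));           // [4, 0, 24]
        System.out.println(Arrays.toString(parse("9 months, 1 day")));            // [0, 9, 1]
    }
}
```

With everything optional, an empty string also matches and yields three zeros, so add an explicit check if that should count as an error.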

Here is another possibility to extract the number of years, months, and days from an item in your list of strings.

I suggest using a map of regular expressions and a function that matches them and returns the result as a map of ChronoUnit | amount pairs.

Here is some sample code to illustrate my suggestion.

private static final Map<ChronoUnit,String> durationRegexMap = Map.ofEntries(
        Map.entry(ChronoUnit.YEARS,"\\d+ (years|year)"),
        Map.entry(ChronoUnit.MONTHS,"\\d+ (months|month)"),
        Map.entry(ChronoUnit.DAYS, "\\d+ (days|day)")
);

private Map<ChronoUnit, Integer> parseDuration(String durationString) {
    return new MapStringToChronoUnitsFunction(durationRegexMap)
            .apply(durationString);
}

class MapStringToChronoUnitsFunction implements Function<String, Map<ChronoUnit, Integer>> {

    Map<ChronoUnit,String> durationRegexMap = new HashMap<>();

    public MapStringToChronoUnitsFunction(Map<ChronoUnit,String> durationByRegex) {
        durationRegexMap.putAll(durationByRegex);
    }

    @Override
    public Map<ChronoUnit, Integer> apply(String textWithDurations) {
        String[] splittedTextWithDurations = textWithDurations.split(",");

        return this.durationRegexMap.entrySet().stream()
                .flatMap(regex -> Arrays.stream(splittedTextWithDurations)
                        .map(String::trim)
                        .filter(trimmedDurationString -> trimmedDurationString.matches(regex.getValue()))
                        .map(matchingTrimmedDurationString -> matchingTrimmedDurationString.replaceAll("\\w+[a-zA-Z]", " "))
                        .map(String::trim)
                        .map(t -> Map.entry(regex.getKey(),Integer.valueOf(t))))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}

The function call would look like this:

Map<ChronoUnit, Integer> durationList = chronoUnitsMapper.parseDuration("1 years, 2 months, 22 days");

The function MapStringToChronoUnitsFunction runs through the regular expressions that are registered in durationRegexMap. It matches each comma-separated part of the input string against each regular expression and returns every match as a ChronoUnit and value pair.
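
If the extra class feels heavy, the same ChronoUnit-keyed result can probably be reached with a single pattern and a find() loop. This is a more compact alternative, not the class above; the names ChronoUnitDurations and PART are hypothetical.

```java
import java.time.temporal.ChronoUnit;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChronoUnitDurations {

    // One pattern for all parts: a number, a space, then a unit name with optional "s".
    static final Pattern PART = Pattern.compile("(\\d+) (year|month|day)s?");

    // Units that do not occur in the text are simply absent from the returned map.
    static Map<ChronoUnit, Integer> parse(String text) {
        Map<ChronoUnit, Integer> result = new EnumMap<>(ChronoUnit.class);
        Matcher m = PART.matcher(text);
        while (m.find()) {
            // "year" -> YEARS, "month" -> MONTHS, "day" -> DAYS
            ChronoUnit unit = ChronoUnit.valueOf(m.group(2).toUpperCase(Locale.ROOT) + "S");
            result.put(unit, Integer.parseInt(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<ChronoUnit, Integer> parsed = parse("9 months, 1 day");
        System.out.println(parsed.getOrDefault(ChronoUnit.YEARS, 0));  // 0
        System.out.println(parsed.getOrDefault(ChronoUnit.MONTHS, 0)); // 9
        System.out.println(parsed.getOrDefault(ChronoUnit.DAYS, 0));   // 1
    }
}
```

Since find() scans the whole string, the order of the units does not matter with this variant.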
