Extract variable values from String

Given that the user can enter values in specific formats only, I need to extract relevant sections of that string into Java variables.

For instance, say the acceptable formats are:

String types[] = {"The quick brown ${animal} jumped over the lazy ${target}.",
                  "${target} loves ${animal}.",
                  "${animal} became friends with ${target}"};

Variables:

private String animal;
private String target;

Now if the user enters "The quick brown fox jumped over the lazy dog.", the animal variable should be set to "fox" and the target variable should be set to "dog".

If the user input matches none of the given types, it should display an error.

Basically I am trying to do the inverse of what org.apache.commons.lang.text.StrSubstitutor does.

My approach (it looks inefficient, hence my asking for help):

Create regex patterns to find out the type of the entered string, and then write different logic for each type. For example, for the first type, get the word after the word "brown" and assign it to the variable animal, and so on.

Using @Josh Withee's answer:

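import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * @param input         String from which values need to be extracted
 * @param templates     Valid regex patterns with named capturing groups
 * @param variableNames Names of the named capturing groups to extract
 * @return Map with variableNames as the keys and the extracted strings as the values,
 * OR an empty, non-null map if the input doesn't match any template; names that are not groups in the matched template are left out
 */
public static Map<String, String> extractVariablesFromString(String input, List<String> templates, String... variableNames) {
    Map<String, String> resultMap = new HashMap<>();
    // Pick the first template that matches the whole input.
    Optional<String> matchedTemplate = templates.stream().filter(input::matches).findFirst();
    matchedTemplate.ifPresent(t -> {
        Matcher m = Pattern.compile(t).matcher(input);
        // matches() mirrors the String.matches() check in the filter; group() is only valid after a successful match.
        if (!m.matches()) {
            return;
        }
        Arrays.stream(variableNames)
                .forEach(v -> {
                    try {
                        resultMap.put(v, m.group(v));
                    } catch (IllegalArgumentException e) {
                        // The matched template has no group named v; skip it.
                    }
                });
    });
    return resultMap;
}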

Tests:

    @Test
    public void shouldExtractVariablesFromString() {
        String input = "The quick brown fox jumped over the lazy dog.";
        String types[] = {"The quick brown (?<animal>.*) jumped over the lazy (?<target>.*).",
                "(?<target>.*) loves (?<animal>.*).",
                "(?<animal>.*) became friends with (?<target>.*)"};
        Map<String, String> resultMap = StringUtils.extractVariablesFromString(input, Arrays.asList(types), "animal", "target1", "target");
        Assert.assertEquals("fox", resultMap.get("animal"));
        Assert.assertEquals("dog", resultMap.get("target"));
        Assert.assertFalse(resultMap.containsKey("target1"));
    }

    @Test
    public void shouldReturnEmptyMapIfInputDoesntMatchAnyPatternForVariableExtraction() {
        String input = "The brown fox passed under the lazy dog.";
        String types[] = {"The quick brown (?<animal>.*) jumped over the lazy (?<target>.*).",
                "(?<animal>.*) became friends with (?<target>.*)"};
        Map<String, String> resultMap = StringUtils.extractVariablesFromString(input, Arrays.asList(types), "animal", "target1", "target");
        Assert.assertTrue(resultMap.isEmpty());
    }

You can do this with named capture groups:

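String userInput = "dog loves fox.";

// The trailing dots are escaped so that they only match a literal full stop.
String[] types = {"The quick brown (?<animal>.*?) jumped over the lazy (?<target>.*?)\\.",
                  "(?<target>.*?) loves (?<animal>.*?)\\.",
                  "(?<animal>.*?) became friends with (?<target>.*?)"};

Matcher m = null;

for (int i = 0; i < types.length; i++) {
    Matcher candidate = Pattern.compile(types[i]).matcher(userInput);
    // matches() requires the whole input to fit the template, so the lazy groups capture complete values.
    if (candidate.matches()) {
        m = candidate;
        break;
    }
}

if (m == null) {
    // The input matched none of the accepted formats.
    throw new IllegalArgumentException("Input does not match any known format: " + userInput);
}

String animal = m.group("animal");
String target = m.group("target");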
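One detail worth keeping in mind with lazy groups such as `.*?`: `Matcher.find()` and `Matcher.matches()` can capture different text for the same pattern, because `find()` is satisfied by the shortest match it can locate, while `matches()` has to consume the whole input. The sketch below illustrates this with the third pattern; the class and method names (`LazyGroupDemo`, `targetViaFind`, `targetViaMatches`) are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyGroupDemo {
    // Third template from the answer: the pattern ends with a lazy group.
    static final Pattern P =
            Pattern.compile("(?<animal>.*?) became friends with (?<target>.*?)");

    // find() stops at the first (shortest) match, so the trailing lazy group can stay empty.
    static String targetViaFind(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group("target") : null;
    }

    // matches() must consume the entire input, so the trailing group is forced to expand.
    static String targetViaMatches(String input) {
        Matcher m = P.matcher(input);
        return m.matches() ? m.group("target") : null;
    }

    public static void main(String[] args) {
        String input = "fox became friends with dog";
        System.out.println("find:    '" + targetViaFind(input) + "'");    // find:    ''
        System.out.println("matches: '" + targetViaMatches(input) + "'"); // matches: 'dog'
    }
}
```

So when the template is meant to describe the whole input, reading the groups from a matcher on which `matches()` succeeded is the safer choice.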
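
Another option is to build the regex from the template itself, turning each placeholder into a named capturing group:

/**
 * @param input    e.g. /Volumes/data/tmp/send/20999999/sx/0000000110-0000000051-007-20211207-01.txt
 * @param template e.g. /{baseDir}/send/{yyyyMMdd}/{organization}/{sendOrganization}-{receiveOrganization}-{fileType}-{date}-{batch}.txt
 * @param prefix   regex matching the opening delimiter of a variable, e.g. "\\{"
 * @param suffix   regex matching the closing delimiter of a variable, e.g. "\\}"
 * @return map from variable name to extracted value; empty if the input does not match the template
 */
public static Map<String, String> extractVariables(final String input, final String template, final String prefix, final String suffix) {
    // Collect the variable names that appear between prefix and suffix in the template.
    final HashSet<String> variableNames = new HashSet<>();
    String variableNamesRegex = "(" + prefix + "([^" + prefix + suffix + "]+?)" + suffix + ")";
    Pattern variableNamesPattern = Pattern.compile(variableNamesRegex);
    Matcher variableNamesMatcher = variableNamesPattern.matcher(template);
    while (variableNamesMatcher.find()) {
        variableNames.add(variableNamesMatcher.group(2));
    }
    // Turn each placeholder into a named capturing group. Literal text in the template is not quoted,
    // so regex metacharacters in it (such as ".") keep their regex meaning.
    final String regexTemplate = template.replaceAll(prefix, "(?<").replaceAll(suffix, ">.*)");
    Map<String, String> resultMap = new HashMap<>();
    Matcher matcher = Pattern.compile(regexTemplate).matcher(input);
    // group() throws IllegalStateException unless a match succeeded, so check the result of find().
    if (matcher.find()) {
        variableNames.forEach(v -> resultMap.put(v, matcher.group(v)));
    }
    return resultMap;
}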

Usage:

extractVariables(input, template2, "\\{", "\\}")
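For the `${...}` placeholder syntax used in the question, the same idea can be sketched with the literal parts of the template passed through `Pattern.quote`, so that characters like `.` are matched literally. This is only a minimal sketch under a few assumptions: the class name `TemplateExtractor` and method `extract` are hypothetical, placeholder names are limited to letters and digits (what Java accepts for group names), and each name appears at most once per template.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateExtractor {
    // Finds ${name} placeholders; names must be valid Java group names.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z][A-Za-z0-9]*)\\}");

    /** Returns name -> value for the template, or an empty map if the input does not match. */
    public static Map<String, String> extract(String input, String template) {
        StringBuilder regex = new StringBuilder();
        List<String> names = new ArrayList<>();
        Matcher p = PLACEHOLDER.matcher(template);
        int last = 0;
        while (p.find()) {
            // Quote the literal text before the placeholder so it is matched verbatim.
            regex.append(Pattern.quote(template.substring(last, p.start())));
            // Replace the placeholder with a named group that needs at least one character.
            regex.append("(?<").append(p.group(1)).append(">.+?)");
            names.add(p.group(1));
            last = p.end();
        }
        // Quote whatever literal text follows the last placeholder.
        regex.append(Pattern.quote(template.substring(last)));

        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex.toString()).matcher(input);
        // matches() forces the whole input to fit the template.
        if (m.matches()) {
            for (String name : names) {
                result.put(name, m.group(name));
            }
        }
        return result;
    }
}
```

Calling `TemplateExtractor.extract(userInput, template)` for each accepted template in turn and stopping at the first non-empty map would then cover the "matches none of the given types" case: if every call returns an empty map, the input is invalid.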
