
Java regular expression word match

I have 3 values: IU, PRI and RET. If my input string contains any one or more of these values, the Java regular expression should return true.

Ex:
Values : IU PRI RET 
Input String : "put returns UI between paragraphs"

The input string contains the word "UI", so the Java regular expression should return true.

You need word boundaries for that:

boolean foundMatch = false;
Pattern regex = Pattern.compile("\\b(?:UI|PRI|RET)\\b");
Matcher regexMatcher = regex.matcher(subjectString);
foundMatch = regexMatcher.find();
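
To see what the word boundaries change, here is a minimal self-contained sketch. The class name `WordBoundaryDemo` and the helpers `containsWord` and `containsAnywhere` are made up for illustration; the alternatives (UI, PRI, RET) are taken from the snippet above.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Same pattern as in the snippet above: one of the values as a whole word.
    private static final Pattern WHOLE_WORD = Pattern.compile("\\b(?:UI|PRI|RET)\\b");
    // The same alternatives without \b, for comparison only.
    private static final Pattern ANYWHERE = Pattern.compile("(?:UI|PRI|RET)");

    static boolean containsWord(String input) {
        return WHOLE_WORD.matcher(input).find();
    }

    static boolean containsAnywhere(String input) {
        return ANYWHERE.matcher(input).find();
    }

    public static void main(String[] args) {
        // "UI" stands alone here, so the whole-word pattern finds it.
        System.out.println(containsWord("put returns UI between paragraphs")); // true
        // "QUIT" contains UI and "PRINTING" contains PRI, but only inside longer words.
        System.out.println(containsWord("QUIT PRINTING"));     // false
        System.out.println(containsAnywhere("QUIT PRINTING")); // true
    }
}
```

Without `\b`, any word that happens to contain one of the values counts as a match, which is usually not what "word match" means.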

Try:

String s = "A IU something PRI something RET whatever";

Pattern p= Pattern.compile("(IU|PRI|RET)");
Matcher m= p.matcher(s);
while (m.find()) {
    String matched= m.group(1);
    System.out.println(matched);
}

It prints:

IU
PRI
RET

I don't know if you are still looking for a solution to this, but here's the code for your question. I assumed that the anagrams you are looking for are separated by spaces and that the words appear in uppercase.

    String text = "put returns UI between IU paragraphs PRI RIP and RET ETR";
    Pattern p = Pattern.compile("([UI]{2}|[PRI]{3}|[RET]{3})");

    Matcher m = p.matcher(text);
    System.out.println(m.find());

If you want case-insensitive matching, change the pattern to the following:

    (?i)([UI]{2}|[PRI]{3}|[RET]{3})
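
As a hedged illustration of how this pattern behaves: the embedded `(?i)` flag has the same effect as passing `Pattern.CASE_INSENSITIVE` to `Pattern.compile`. It is also worth knowing that a character class such as `[UI]{2}` accepts any two characters from the set, so it matches "II" and "UU" as well as the real anagrams, and without word boundaries it also matches inside longer words. The class name `CharClassDemo` and the helper `found` below are invented for this sketch.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Passing the flag is equivalent to prefixing the pattern with (?i).
    private static final Pattern LOOSE =
            Pattern.compile("([UI]{2}|[PRI]{3}|[RET]{3})", Pattern.CASE_INSENSITIVE);

    static boolean found(String input) {
        return LOOSE.matcher(input).find();
    }

    public static void main(String[] args) {
        // Matches, but already on "ret" inside "returns", before "ui" is reached.
        System.out.println(found("put returns ui between paragraphs")); // true
        // "II" is not an anagram of "IU", yet [UI]{2} accepts it.
        System.out.println(found("II"));          // true
        System.out.println(found("hello world")); // false
    }
}
```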

OK, here's a crazy solution that builds the anagrams of each given String into a Pattern, just for fun:

public static void main(String[] args) {
    try {
        Pattern pattern = makePattern("IU", "PRI", "RET");
        System.out.println(pattern.pattern());
        String test = "put returns UI between paragraphs, also IRP and TER";
        Matcher matcher = pattern.matcher(test);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}
public static Pattern makePattern(String... words) throws Exception {
    if (words == null || words.length == 0) {
        throw new Exception("TODO handle invalid argument");
    }
    StringBuilder patternBuilder = new StringBuilder("(");
    for (String word : words) {
        if (word == null || word.isEmpty()) {
            throw new Exception("TODO invalid word");
        }
        for (String anagram: doAnagrams(word, null)) {
            patternBuilder.append("\\b").append(anagram).append("\\b").append("|");
        }
    }
    patternBuilder.deleteCharAt(patternBuilder.length() - 1);
    patternBuilder.append(")");
    return Pattern.compile(patternBuilder.toString());
}
public static Set<String> doAnagrams(String original, Set<String> processed) {
    if (original == null || original.isEmpty()) {
        return new LinkedHashSet<String>();
    }
    Set<String> result;
    if (processed == null) {
        result = new LinkedHashSet<String>();
        result.add(original);
    } else {
        result = processed;
    }
    if (original.length() <= 1) {
        return result;
    }
    String sub = original.substring(1);
    String subStart = original.substring(0, 1);
    for (String subAnagram : doAnagrams(sub, null)) {
        result.add(subAnagram.concat(subStart));
    }
    if (sub.concat(original.substring(0, 1)).equals(result.iterator().next())) {
        return result;
    } 
    else {
        return doAnagrams(sub.concat(subStart), result);
    }
}

Output:

(\bIU\b|\bUI\b|\bPRI\b|\bRIP\b|\bIRP\b|\bIPR\b|\bPIR\b|\bRPI\b|\bRET\b|\bETR\b|\bTER\b|\bTRE\b|\bRTE\b|\bERT\b)
UI
IRP
TER
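
If the goal is only a true/false answer, a lighter alternative to enumerating every permutation is to sort the letters of each word and compare the result against the sorted letters of each value. This is a different technique from the one above, not a rewrite of it. The sketch assumes words are separated by non-word characters and that matching is case-sensitive; the class name `AnagramWordMatch` and the methods `key` and `containsAnagram` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AnagramWordMatch {
    // Sort the characters so that all anagrams of a word share one canonical key.
    static String key(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    static boolean containsAnagram(String input, String... values) {
        Set<String> keys = new HashSet<>();
        for (String value : values) {
            keys.add(key(value));
        }
        // \W+ splits on runs of non-word characters (spaces, commas, and so on).
        for (String token : input.split("\\W+")) {
            if (keys.contains(key(token))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String test = "put returns UI between paragraphs, also IRP and TER";
        System.out.println(containsAnagram(test, "IU", "PRI", "RET")); // true
        System.out.println(
                containsAnagram("put returns between paragraphs", "IU", "PRI", "RET")); // false
    }
}
```

The number of permutations grows factorially with word length, so the sorted-key comparison scales better for longer values, at the cost of no longer being a single regular expression.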

You can do this in one line and get a boolean value.

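// Pattern.matches(...) would require the whole string to match, so search with find() instead.
boolean found = Pattern.compile("[UI]{2}|[PRI]{3}|[RET]{3}").matcher(stringToBeMatched).find();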
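
Note that `Pattern.matches` and `Matcher.find` answer different questions: the former is true only when the entire input matches the pattern, while the latter looks for a match anywhere in the input. A small sketch of the difference, using the same regex; the class name `MatchesVsFind` and its two helper methods are invented for illustration.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    static final String REGEX = "[UI]{2}|[PRI]{3}|[RET]{3}";

    // True only if the entire input is matched by the pattern.
    static boolean wholeInputMatches(String input) {
        return Pattern.matches(REGEX, input);
    }

    // True if any part of the input is matched by the pattern.
    static boolean containsMatch(String input) {
        return Pattern.compile(REGEX).matcher(input).find();
    }

    public static void main(String[] args) {
        String sentence = "put returns UI between paragraphs";
        System.out.println(wholeInputMatches("UI"));     // true
        System.out.println(wholeInputMatches(sentence)); // false
        System.out.println(containsMatch(sentence));     // true
    }
}
```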

