
Regex which matches a string containing at least the specified characters

I have a huge dictionary which I'm trying to search using a regex. What I would like to do is find all the words in the dictionary which contain at least one occurrence of each character I provide, in no particular order.

Right now I can find words which contain only the specified characters, but as I said, that is not exactly what I want.

Example:

I want at least one occurrence of each of the following characters: {b, a, d}

astring.matches(regex)

I would expect words like:

badder, baddest, baffled

Notice that they all contain at least one occurrence of each character, in no particular order, and that other characters are also present in the strings.

Does anyone know how to do this? Other suggestions are also welcome!

You can use lookaheads to do this, if your regex engine supports them:

(?=.*b)(?=.*a)(?=.*d)

However, this is quite inefficient. Is there any reason you can't use multiple String.indexOf checks instead?
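The String.indexOf suggestion could be sketched roughly as follows. The class name ContainsAllChars, the helper containsAll, and the sample word list are hypothetical names chosen for illustration; they are not part of the original answer.

```java
public class ContainsAllChars {

    // Returns true if `word` contains every character of `required` at least once.
    // Order does not matter, and other characters in `word` are allowed.
    public static boolean containsAll(String word, String required) {
        for (int i = 0; i < required.length(); i++) {
            // indexOf returns -1 when the character does not occur in the word
            if (word.indexOf(required.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical sample words standing in for the dictionary
        String[] words = {"badder", "baddest", "baffled", "bat", "apple"};
        for (String w : words) {
            if (containsAll(w, "bad")) {
                System.out.println(w);
            }
        }
    }
}
```

This avoids the regex engine entirely: each word is scanned at most once per required character, with no backtracking.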

You need a series of look-aheads:

^(?=.*b)(?=.*a)(?=.*d).*

This is a pain to construct by hand. However, you can ease the pain by using a regex to build it:

String regex = "^" + "bad".replaceAll(".", "(?=.*$0)") + ".*";
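One caveat with the replaceAll one-liner is that it copies each character into the pattern unescaped, so a required character such as "." or "*" would be read as regex syntax. A minimal sketch of a variant that quotes each character with Pattern.quote is shown below; the class name LookaheadBuilder and the method buildRegex are hypothetical, and this is one possible approach rather than the answer's own.

```java
import java.util.regex.Pattern;

public class LookaheadBuilder {

    // Builds "^(?=.*c1)(?=.*c2)....*" with one lookahead per required character.
    // Each character is wrapped with Pattern.quote so that regex metacharacters
    // are matched literally.
    public static String buildRegex(String required) {
        StringBuilder sb = new StringBuilder("^");
        for (char c : required.toCharArray()) {
            sb.append("(?=.*")
              .append(Pattern.quote(String.valueOf(c)))
              .append(")");
        }
        return sb.append(".*").toString();
    }

    public static void main(String[] args) {
        System.out.println("badder".matches(buildRegex("bad"))); // true
        System.out.println("bat".matches(buildRegex("bad")));    // false, no 'd'
        System.out.println("abc".matches(buildRegex(".")));      // false, no literal '.'
        System.out.println("a.c".matches(buildRegex(".")));      // true
    }
}
```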

If you use the regex repeatedly, it is better to compile it once with the following code, because every call to String.matches() compiles the regex again (there is no caching):

import java.util.regex.Pattern;

// do this once
Pattern pattern = Pattern.compile(regex);

// reuse the pattern many times
if (pattern.matcher(input).matches()) {
    // input contains every required character
}
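Putting the pieces together, a dictionary scan with a single precompiled pattern might look like the sketch below. The class name DictionaryFilter, the method filter, and the sample words are assumptions made for illustration. The required characters are assumed to be plain letters, since the replaceAll construction does not escape regex metacharacters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DictionaryFilter {

    // Returns the words that contain every character of `required` at least once.
    public static List<String> filter(List<String> words, String required) {
        // Build ^(?=.*b)(?=.*a)(?=.*d).* from "bad", as in the answer above
        String regex = "^" + required.replaceAll(".", "(?=.*$0)") + ".*";

        // Compile once, outside the loop
        Pattern pattern = Pattern.compile(regex);

        List<String> hits = new ArrayList<>();
        for (String word : words) {
            // Reuse the compiled pattern for every word
            if (pattern.matcher(word).matches()) {
                hits.add(word);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample words standing in for the dictionary
        List<String> words = Arrays.asList("badder", "baddest", "baffled", "bat", "apple");
        System.out.println(filter(words, "bad")); // prints [badder, baddest, baffled]
    }
}
```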


