How to match any combination of letters using regex?
How can I match the letters a, b, c once each, in any combination and at varying lengths, like this:
The expression should match these cases:
abc
bc
a
b
bca
but should not match these:
abz
aab
cc
x
Use this regex pattern:
\b(?!\w*(\w)\w*\1)[abc]+\b
You can use this pattern with a set of any size; just replace [abc] with the desired set.
(The output above is from myregextester.)
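As a quick check, here is a minimal sketch that runs the pattern over the examples from the question. It assumes Python's re module (the answer doesn't name a regex flavor), and the helper name find_words is made up for illustration.

```python
import re

# Pattern from the answer: a whole word made only of a, b, c,
# with the negative lookahead rejecting any word that repeats a character.
PATTERN = re.compile(r"\b(?!\w*(\w)\w*\1)[abc]+\b")

def find_words(text):
    """Return every whole-word match in text (hypothetical helper)."""
    # finditer + group(0) is used because findall would return the
    # capture group from inside the lookahead instead of the match.
    return [m.group(0) for m in PATTERN.finditer(text)]

should_match = ["abc", "bc", "a", "b", "bca"]
should_not_match = ["abz", "aab", "cc", "x"]

print(find_words(" ".join(should_match + should_not_match)))
# ['abc', 'bc', 'a', 'b', 'bca']
```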
^(?=([^a]*a?[^a]*)$)(?=([^b]*b?[^b]*)$)(?=([^c]*c?[^c]*)$)[abc]{1,3}$
This works with lookaheads.
It includes this pattern in three variations: (?=([^a]*a?[^a]*)$)
It says: there must be at most one a from here (the beginning) until the end.
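The same idea can be generated for any set of letters rather than typed out by hand. A rough sketch, assuming Python's re module; the helper at_most_once_pattern is hypothetical, it assumes the letters are distinct plain alphanumerics, and it drops the capture groups inside the lookaheads since nothing refers to them.

```python
import re

def at_most_once_pattern(letters):
    """Build the lookahead-style pattern for a set of letters (hypothetical helper).

    Assumes `letters` holds distinct plain alphanumeric characters.
    """
    # One anchored lookahead per letter: at most one occurrence
    # between the start and the end of the string.
    lookaheads = "".join(f"(?=[^{c}]*{c}?[^{c}]*$)" for c in letters)
    # The body restricts the string to the allowed letters and caps the length.
    return re.compile(f"^{lookaheads}[{letters}]{{1,{len(letters)}}}$")

pattern = at_most_once_pattern("abc")
for s in ["abc", "bc", "a", "b", "bca", "abz", "aab", "cc", "x"]:
    print(s, bool(pattern.search(s)))
```

With "abc" as input this should accept the first five strings and reject the last four.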
Combining lookaheads and backreferences:
^([abc])((?!\1)([abc])((?!\1)(?!\3)[abc])?)?$
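Nested patterns like this are easy to get subtly wrong, so a brute-force comparison can help. The sketch below assumes Python's re module and checks the pattern against a plain-Python definition of "only a, b, c, no letter twice" for every string up to length 4 over the letters a, b, c, z. The candidates helper and the choice of z as the "invalid" letter are mine.

```python
import re
from itertools import product

PATTERN = re.compile(r"^([abc])((?!\1)([abc])((?!\1)(?!\3)[abc])?)?$")

def candidates(alphabet="abcz", max_len=4):
    """Yield every string over alphabet with length 1..max_len."""
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

# Strings the regex accepts.
accepted = {s for s in candidates() if PATTERN.search(s)}
# Reference definition: only a, b, c, and no letter used twice.
expected = {s for s in candidates()
            if set(s) <= set("abc") and len(set(s)) == len(s)}

print(accepted == expected, len(accepted))  # True 15
```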
Just to round out the collection:
^(?:([abc])(?!.*\1))+$
Want to handle a larger set of characters? No problem:
^(?:([abcdefgh])(?!.*\1))+$
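Since only the character class changes, the pattern can be built from whatever set you need. A small sketch, assuming Python's re module; unique_letters_pattern is a hypothetical helper name.

```python
import re

def unique_letters_pattern(letters):
    """Build the capture-then-negative-lookahead pattern for `letters` (hypothetical helper)."""
    # Each matched character is captured, and the lookahead rejects it
    # if the same character shows up again later in the string.
    return re.compile(rf"^(?:([{re.escape(letters)}])(?!.*\1))+$")

pattern = unique_letters_pattern("abcdefgh")
print(bool(pattern.search("hgfedcba")))  # True: every letter used once
print(bool(pattern.search("abca")))      # False: a appears twice
print(bool(pattern.search("abi")))       # False: i is not in the set
```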
EDIT: Apparently I misread the question; you're not validating individual strings like "abc" and "ba", you're trying to find whole-word matches in a larger string. Here's how I would do that:
\b(?:([abc])(?![abc]*\1))+\b
The tricky part is making sure the lookahead doesn't look beyond the end of the word that's currently being matched. For example, if I had left the lookahead as (?!.*\1), it would fail to match the abc in abc za because the lookahead would incorrectly flag the a in za as a duplicate of the a in abc. Allowing the lookahead to look only at valid characters ([abc]*) keeps it on a sufficiently short leash. And if there are invalid characters in the current word, it's not the lookahead's job to spot them anyway.
(Thanks to Honest Abe for bringing this back to my attention.)
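The abc za case is easy to reproduce. A minimal sketch, assuming Python's re module; the names bounded, unbounded and words are just for illustration.

```python
import re

# Lookahead limited to valid characters, as in the answer.
bounded = re.compile(r"\b(?:([abc])(?![abc]*\1))+\b")
# Lookahead that is allowed to see past the end of the current word.
unbounded = re.compile(r"\b(?:([abc])(?!.*\1))+\b")

def words(pattern, text):
    """Return the full text of every match (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(text)]

text = "abc za"
print(words(bounded, text))    # ['abc']
print(words(unbounded, text))  # [] because the a in za is flagged as a duplicate
```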
^(?=(.*a.*)?$)(?=(.*b.*)?$)(?=(.*c.*)?$)[abc]{,3}$
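The anchored lookaheads are meant to limit each letter to one occurrence. Note that, as written, each lookahead also requires its letter to be present (unless the string is empty), so inputs like bc are rejected, and {,3} is only treated as a quantifier by some regex flavors.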
I linked it in a comment (this is sort of a dupe of "How can I find repeated characters with a regex in Java?"), but to be more specific, the regex:
(\w)\1+
This will match two or more of the same character in a row. Negate that and you have your regex.
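One way to do the negation is in code rather than in the regex itself. The sketch below assumes Python's re module. Since (\w)\1+ only sees back-to-back repeats, it also includes a swapped-in variant, (\w)\w*\1, which is not from the answer and flags repeats anywhere in the string. The is_valid helper and the [abc]+ whitelist are my additions.

```python
import re

ADJACENT_REPEAT = re.compile(r"(\w)\1+")  # the answer's pattern: back-to-back repeats
ANY_REPEAT = re.compile(r"(\w)\w*\1")     # swapped-in variant: repeats anywhere
ALLOWED = re.compile(r"[abc]+")           # assumed whitelist of letters

def is_valid(s, repeat=ANY_REPEAT):
    """True if s uses only allowed letters and `repeat` finds nothing (hypothetical helper)."""
    return bool(ALLOWED.fullmatch(s)) and not repeat.search(s)

print(is_valid("abc"))                   # True
print(is_valid("aab"))                   # False
print(is_valid("aba"))                   # False
print(is_valid("aba", ADJACENT_REPEAT))  # True: the two a's are not adjacent
```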