What is the regex to match the words containing all the vowels?

I am learning regex in Python but can't seem to get the hang of it. I am trying to pick out all the words that contain every English vowel, and this is my regex:

r'\b(\S*[aeiou]){5}\b'

This seems too vague: any vowel (even a repeated one) can appear anywhere, any number of times, so it returns words like 'actionable' and 'unfortunate', which do contain five vowels but not all five different vowels. I looked around the internet and found this regex:

r'[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*'

But it appears to work only when the vowels occur in sequence (a, e, i, o, u, in that order), which is a much narrower task than the one I am trying to accomplish. Can someone 'think out loud' while crafting the regex for my problem?
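To make the difference between the two attempts concrete, here is a small Python check. The sample words and the helper has_all_vowels are my own additions for illustration, and re.fullmatch is used so that each pattern is tested against one whole word; treat it as a sketch, not a definitive test.

```python
import re

# The two patterns from the question.
count_five = re.compile(r'\b(\S*[aeiou]){5}\b')
in_order = re.compile(r'[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*')

def has_all_vowels(word):
    # Reference check without regex (hypothetical helper):
    # every vowel must be among the word's letters.
    return set('aeiou') <= set(word)

# Sample words chosen for illustration.
for word in ['actionable', 'unfortunate', 'facetious', 'education']:
    print(word,
          bool(count_five.fullmatch(word)),
          bool(in_order.fullmatch(word)),
          has_all_vowels(word))
```

The counting pattern accepts 'actionable' and 'unfortunate' even though each lacks one vowel, while the ordered pattern accepts 'facetious' but rejects 'education', whose vowels are all present but out of order.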

If you plan to match words as chunks of text consisting only of English letters, you may use a regex like

\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b

See the regex demo.

To support languages other than English, you may replace [a-zA-Z]+ with [^\W\d_]+.
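As a rough illustration of that substitution, the sketch below runs both variants over a short French phrase. The sample text is my own choice, and the code assumes Python 3 str patterns, where \w and \b are Unicode-aware by default.

```python
import re

# ASCII-only variant and the "any letter" variant of the same pattern.
ascii_rx = re.compile(r'\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b')
letters_rx = re.compile(r'\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[^\W\d_]+\b')

# Sample text chosen for illustration; \u00e9 is the letter e with an acute accent.
text = "une revue a\u00e9ronautique"

ascii_hits = ascii_rx.findall(text)      # [a-zA-Z]+ cannot consume the accented letter
letters_hits = letters_rx.findall(text)  # [^\W\d_]+ consumes any Unicode letter
print(ascii_hits)
print(letters_hits)
```

Note that the accented letter counts as a letter for [^\W\d_]+ but does not satisfy the lookahead for a plain e; the word matches here only because it also ends in an unaccented e.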

If a "word" you want to match is a chunk of non-whitespace chars, you may use

(?<!\S)(?=\S*?a)(?=\S*?e)(?=\S*?i)(?=\S*?o)(?=\S*?u)\S+

See this regex demo.

Define these patterns in Python using raw string literals, e.g.:

rx_AllVowelWords = r'\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b'
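A possible way to use that definition is with re.findall, sketched below. The sample sentence is mine, and adding re.IGNORECASE is an assumption about what you may want, since the pattern as written only looks for lowercase vowels.

```python
import re

rx_AllVowelWords = r'\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b'

# Sample sentence chosen for illustration.
text = "Education is a facetious, unfortunate business"

# Lookaheads do not capture, so findall returns the whole matched words.
lowercase_only = re.findall(rx_AllVowelWords, text)
any_case = re.findall(rx_AllVowelWords, text, flags=re.IGNORECASE)
print(lowercase_only)
print(any_case)
```

Without the flag, "Education" is skipped because its only e is uppercase; with the flag, it is found alongside "facetious".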

Details

  • \b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b :
    • \b - a word boundary, here, a starting word boundary
    • (?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u) - a sequence of positive lookaheads that are triggered right after the word boundary position is detected, and require the presence of a, e, i, o and u after any 0+ word chars (letters, digits, underscores; you may replace \w*? with [^\W\d_]*? to only check letters)
    • [a-zA-Z]+ - 1 or more ASCII letters (replace with [^\W\d_]+ to match all letters)
    • \b - a word boundary, here, a trailing word boundary
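One way to "think out loud" about this breakdown is to assemble the pattern piece by piece in code. The sketch below does that; the names VOWELS, lookaheads and pattern are hypothetical, and the test words are my own.

```python
import re

VOWELS = "aeiou"

# One positive lookahead per required letter. Each one starts at the same
# position (the word start) and scans ahead without consuming anything.
lookaheads = "".join(rf"(?=\w*?{v})" for v in VOWELS)

# Boundary, then the checks, then the part that actually consumes the word.
pattern = rf"\b{lookaheads}[a-zA-Z]+\b"
print(pattern)

rx = re.compile(pattern)
print(bool(rx.fullmatch("sequoia")), bool(rx.fullmatch("sequin")))
```

Because every lookahead is anchored at the same spot, the order in which the vowels appear in the word does not matter; changing VOWELS would yield the analogous pattern for another set of required letters.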

The second pattern details:

  • (?<!\S)(?=\S*?a)(?=\S*?e)(?=\S*?i)(?=\S*?o)(?=\S*?u)\S+ :
    • (?<!\S) - a position at the start of the string or after a whitespace
    • (?=\S*?a)(?=\S*?e)(?=\S*?i)(?=\S*?o)(?=\S*?u) - all English vowels must be present, in any order, after any 0+ chars other than whitespace
    • \S+ - 1+ non-whitespace chars.
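The practical difference between the two patterns shows up on tokens that contain non-letter characters. Here is a minimal sketch with a hyphenated token; the sample text is my own.

```python
import re

word_rx = re.compile(r'\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b')
chunk_rx = re.compile(r'(?<!\S)(?=\S*?a)(?=\S*?e)(?=\S*?i)(?=\S*?o)(?=\S*?u)\S+')

# Sample text chosen for illustration: the vowels are split across a hyphen.
text = "an audio-engine update"

word_hits = word_rx.findall(text)    # \w* cannot look past the hyphen
chunk_hits = chunk_rx.findall(text)  # \S* treats the hyphenated token as one chunk
print(word_hits)
print(chunk_hits)
```

Keep in mind that the chunk-based pattern also swallows adjacent punctuation, since a trailing comma or period is a non-whitespace char too.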

I can't think of an easy way to find "words with all vowels" with a single regexp, but it can easily be done by AND-ing together separate regex matches for a, e, i, o, and u. For example, something like the following Python script should determine whether a given English word has all vowels (in any order, any multiplicity) or not:

#!/usr/bin/python3
# all-vowels.py
import sys
import re
# Expect exactly one argument: the word to test.
if len(sys.argv) != 2:
    sys.exit("usage: all-vowels.py WORD")
word = sys.argv[1]
# AND together one search per vowel; all five must succeed.
if (re.search(r'a', word) and re.search(r'e', word) and re.search(r'i', word)
        and re.search(r'o', word) and re.search(r'u', word)):
    print("Word has all vowels!")
else:
    print("Word does NOT have all vowels.")
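If you want to apply the same idea to many words, it might be wrapped in a function like the sketch below. The function name, the vowels parameter, the case-insensitive flag and the sample list are all my own assumptions, not part of the script above.

```python
import re

def has_all_vowels(word, vowels="aeiou"):
    """Return True if every vowel occurs somewhere in word (case-insensitive)."""
    # all() ANDs the individual searches, mirroring the chain of "and"s.
    return all(re.search(v, word, flags=re.IGNORECASE) for v in vowels)

# Sample words chosen for illustration.
words = ["education", "actionable", "Sequoia", "rhythm"]
matching = [w for w in words if has_all_vowels(w)]
print(matching)
```

Since each search is for a single literal letter, a plain set comparison such as set(vowels) <= set(word.lower()) would give the same answer without regex at all.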
