How to match words with no vowels?

Question

What counts as a vowel or a word can be subjective, so I have set these rules:

A vowel is any of a, e, i, o, u. Not y.
A word is a sequence of English letters, a-z and A-Z.
Newline (\n), comma (,), period (.), and space are not part of a word.

I have the following string:

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

My attempt:

s = re.findall(r'[^aeiouAEIOU].*', text)
print(s)

Expectation:

['sntshk', 'xx', 'yy', 'zz']

Reality:

['line with every word a vowel', '\nsntshk xx yy.', '\nOkay zz fine.']

Related: Search all words with no vowels

Answer 1

I would use the pattern \b[^AEIOU_0-9\W]+\b in case-insensitive mode:

import re

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

# Assign the result so that print(s) shows the matches
s = re.findall(r'\b[^AEIOU_0-9\W]+\b', text, flags=re.I)
print(s)

['sntshk', 'xx', 'yy', 'zz']

The pattern [^\W] is in fact a double negative and means any word character. Inside the same negated class we also exclude vowels, digits, and the underscore, which leaves only consonants.
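
As a quick check of the double-negative reading, the sketch below compares [^\W] with \w and then shows what the extra exclusions remove. The sample string is made up for illustration. Note also that for str patterns Python's \w is Unicode-aware, so non-ASCII letters would match as well unless re.ASCII is passed.

```python
import re

sample = "ab_1 xyz Q9"  # hypothetical sample, not from the question

# [^\W] means "not a non-word character", which is the same set as \w
assert re.findall(r'[^\W]', sample) == re.findall(r'\w', sample)

# Also excluding vowels, digits and underscore leaves only consonants
consonants = re.findall(r'[^AEIOU_0-9\W]', sample, flags=re.I)
print(consonants)  # ['b', 'x', 'y', 'z', 'Q']
```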

Answer 2

Use an ordinary character set composed of alphabetical characters, excluding the vowels, with word boundaries at each end:

(?i)\b[b-df-hj-np-tv-z]+\b

https://regex101.com/r/DqGuY1/1

(?i) - Case-insensitive match
\b - Word boundary
[b-df-hj-np-tv-z]+ - Repeat one or more of:
- characters in the range b-d, f-h, j-n, p-t, or v-z
\b - Word boundary

More readably, but less elegantly, you could also use

(?i)\b(?:(?![eiou])[b-z])+\b
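
To confirm that the two patterns are interchangeable on the question's input, here is a small sketch that runs both through re.findall. The variable names are illustrative only.

```python
import re

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

explicit = r'(?i)\b[b-df-hj-np-tv-z]+\b'     # consonant ranges spelled out
lookahead = r'(?i)\b(?:(?![eiou])[b-z])+\b'  # b-z, rejecting e, i, o, u per character

explicit_hits = re.findall(explicit, text)
lookahead_hits = re.findall(lookahead, text)
print(explicit_hits)   # ['sntshk', 'xx', 'yy', 'zz']
print(lookahead_hits)  # ['sntshk', 'xx', 'yy', 'zz']
```

The lookahead version starts from b, so a is never in the class, and the negative lookahead removes the remaining four vowels.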

Answer 3

There is a pure Python way to do this without any imports:

[x.strip('.') for x in text.split() if all(y.lower() not in 'aeiou' for y in x)]

Example:

text = """line with every word a vowel 
sntshk xx yy.
Okay zz fine."""

print([x.strip('.') for x in text.split() if all(y.lower() not in 'aeiou' for y in x)])
# ['sntshk', 'xx', 'yy', 'zz']
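
The question's rules also exclude commas from words, while the one-liner strips only periods. A minimal variation, assuming punctuation appears only at the edges of a token, strips both characters and skips tokens that end up empty. The sample string is hypothetical.

```python
sample = "rhythm, xx. Okay , zz"  # hypothetical input containing commas

# Strip periods and commas from both ends of each whitespace-separated token
words = [w.strip('.,') for w in sample.split()]

# Keep non-empty tokens that contain no vowel
no_vowels = [w for w in words if w and all(c.lower() not in 'aeiou' for c in w)]
print(no_vowels)  # ['rhythm', 'xx', 'zz']
```

Like the original one-liner, this does not check that a token consists of letters only, so a token such as "x9" would still be kept.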

Answer 4

[^aeiouAEIOU]

This matches any character except aeiouAEIOU, so it also matches non-letters such as spaces and punctuation. Since you only want words, match just the letters other than vowels:

\b[bcdfghjklmnpqrstvwxyz]+\b

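
To see the difference concretely, this sketch runs the bare negated class and the consonant-only pattern against the question's text. The re.I flag is an addition made here (it is not part of the pattern above) so that uppercase consonants are covered too.

```python
import re

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""

# The negated class alone also matches spaces, periods and newlines
non_vowels = re.findall(r'[^aeiouAEIOU]', text)
print(' ' in non_vowels, '.' in non_vowels, '\n' in non_vowels)  # True True True

# Listing only consonants and anchoring with \b yields whole words
words = re.findall(r'\b[bcdfghjklmnpqrstvwxyz]+\b', text, flags=re.I)
print(words)  # ['sntshk', 'xx', 'yy', 'zz']
```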

Answer 5

This works:

text = """line with every word a vowel
sntshk xx yy.
Okay zz fine."""
q = ''
for word in text.split():
    word = word.strip('.')  # drop a leading or trailing period
    for ch in word:
        if ch.lower() in 'aeiou':
            break  # the word contains a vowel, so skip it
    else:
        # the loop finished without break: no vowel was found
        q += word + ' '
print(q)

5 answers

Answer 1: 2, 2019-08-18 04:16:13
Answer 2: 2, accepted, 2019-08-18 04:16:16
Answer 3: 2, 2019-08-18 04:25:26
Answer 4: 1, 2019-08-18 04:20:38
Answer 5: 0, 2019-08-18 04:59:09
