Python regex to calculate vowel/consonant ratio in English
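I've started a fairly silly linguistics project to learn regular expressions in Python. I'm fairly sure I can avoid making multiple passes over the same string and find a more "compact" and "pythonic" way to do what I want, which is to use regular expressions to work out whether each 'Y|y' is acting as a vowel or a consonant. At the bottom of the code segment I've included a comment block of 20 words containing 12 vowel y's and 9 consonant y's. It seems the code could be simplified and the re.compile lines merged.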
import re
vowelRegex = re.compile(r'[aeiouAEIOU]')
consoRegex = re.compile(r'[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]')
yconsRegex = re.compile(r'[aeiou]y[aeiou]')
ycon2Regex = re.compile(r'\bY')
yVowlRegex = re.compile(r'[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]y[b-df-hj-np-tv-xz]')
yVow2Regex = re.compile(r'y\b')
#thestring = 'Sky Family Yurt Germany Crypt Day New York Pennsylvania Myth Hungry Yolk Year Bayou Yak Silly Beyond Dynamite Mystery Yacht Yoda'
#thestring = 'Crypt Pennsylva Myth Dynamite Mystery'
#thestring='RoboCop eats baby food. Pennsylvania Baby Food in the bayou. And, New York is where I\'d Rather be!'
thestring='violent irrational intolerant allied to racism and ' \
'tribalism bigotry invested in ignorance and hostile to free '\
'inquiry contemptuous of women and coercive towards children ' \
'organized religion ought to have a great deal on its conscience ' \
'Yak yacht beyond mystery'
fun=vowelRegex.findall(thestring)
nofun=consoRegex.findall(thestring)
funny = yVowlRegex.findall(thestring)
foony = []
for f in funny:
    foony.append(f[1])  # keep only the middle 'y' of each three-character match
fun += foony
fun += yVow2Regex.findall(thestring)
notfunny = yconsRegex.findall(thestring)
foony = []
for f in notfunny:
    foony.append(f[1])  # keep only the middle 'y' of each three-character match
nofun += foony
nofun += ycon2Regex.findall(thestring)
print(thestring)
print('Vowels:',''.join(fun), len(''.join(fun)))
print('Consos:',''.join(nofun), len(''.join(nofun)))
'''
Sky Vowel; endswith 1
Family Vowel; endswith 2
Yurt Consonant; begswith 1
Germany Vowel; endswith 3
Crypt Vowel; sandwiched 1
Day Vowel; endswith 4
New York Consonant; begswith 2
Pennsylva Vowel; sandwiched 2
Myth Vowel; sandwiched 3
Hungry Vowel; endswith 5
Yolk Consonant; begswith 3
Year Consonant; begswith 4
Bayou Consonanwich 1
Yak Consonant; begswith 5
Silly Vowel; endswith 6
Beyond Consonanwich 2
Dynamite Vowel; sandwiched 4
Mystery Vowel; sandwiched, Vowel; endswith!
Yacht Consonant; begswith 6
Yoda Consonant; begswith 7
'''
You can use the or operator (|) inside a regular expression, which cuts this down a bit. For example:
yVowlRegex = re.compile(r'[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]y[b-df-hj-np-tv-xz]|y\b')
This now covers both yVowl and yVow2.
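To see what the merged pattern returns, here is a minimal sketch run on a short sample. The names y_vowel, sample, and matches are made up for illustration; the pattern itself is the one shown above.

```python
import re

# Merged pattern: a lowercase 'y' sandwiched between consonants, OR a word-final 'y'.
y_vowel = re.compile(r'[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]y[b-df-hj-np-tv-xz]|y\b')

sample = 'Sky Crypt Yak Mystery'  # illustrative input, not from the original post
matches = y_vowel.findall(sample)

# Sandwich matches come back as three characters (consonant, y, consonant),
# word-final matches as a bare 'y'.
print(matches)       # ['y', 'ryp', 'Mys', 'y']
print(len(matches))  # 4 vowel-like y's; the 'Y' in 'Yak' is not matched
```

Because the sandwich branch consumes the consonants on both sides, two sandwiches that share a consonant cannot both match, so a word such as "syzygy" would be undercounted.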
@Joshua-Lewis's answer led me to the following approach for simplifying the code above:
import re
vowelRegex = re.compile(r'[aeiouAEIOU]|[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]y[b-df-hj-np-tv-xz]|y\b')
consoRegex = re.compile(r'[b-df-hj-np-tv-xzB-DF-HJ-NP-TV-XZ]|[aeiou]y[aeiou]|\bY')
vowelRescan = re.compile(r'[aeiouyAEIOUY]')
consoRescan = re.compile(r'[b-df-hj-np-tv-xyzB-DF-HJ-NP-TV-XYZ]')
thestring='any and every religion is violent irrational intolerant '\
'allied to racism and tribalism bigotry invested in ignorance and '\
'hostile to free inquiry contemptuous of women and coercive towards '\
'children organized religion ought to have a great deal on its '\
'conscience why it continues toward the 22nd century ACE is a mystery '\
'known only to New Yorkers and lovers of the bayou'
fun=vowelRegex.findall(thestring)
funn=''.join(fun)
fun = ''.join(vowelRescan.findall(funn))
nofun=consoRegex.findall(thestring)
nofunn=''.join(nofun)
nofun=''.join(consoRescan.findall(nofunn))
print(thestring)
print('Vowels:',fun, len(fun))
print('Consos:',nofun, len(nofun))
'''
Sky Vowel; endswith 1
Family Vowel; endswith 2
Yurt Consonant; begswith 1
Germany Vowel; endswith 3
Crypt Vowel; sandwiched 1
Day Vowel; endswith 4
New York Consonant; begswith 2
Pennsylva Vowel; sandwiched 2
Myth Vowel; sandwiched 3
Hungry Vowel; endswith 5
Yolk Consonant; begswith 3
Year Consonant; begswith 4
Bayou Consonanwich 1
Yak Consonant; begswith 5
Silly Vowel; endswith 6
Beyond Consonanwich 2
Dynamite Vowel; sandwiched 4
Mystery Vowel; sandwiched, Vowel; endswith!
Yacht Consonant; begswith 6
Yoda Consonant; begswith 7
'''
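One possible way to get down to a single pass, offered as a sketch and not as the method used in the post: replace the consuming consonant/vowel neighbours with lookarounds, and put the two categories in named groups so that no rescan step is needed. The names CONS, pattern, and classify are hypothetical. Unlike the code above, this sketch uses re.IGNORECASE, so a lowercase word-initial 'y' also counts as a consonant.

```python
import re

CONS = 'b-df-hj-np-tv-xz'  # consonant ranges, deliberately excluding 'y'

# Lookbehind/lookahead test the neighbours without consuming them,
# so every match is exactly one character.
pattern = re.compile(
    rf'''
    (?P<vowel>[aeiou]
        | (?<=[{CONS}])y(?=[{CONS}])   # y sandwiched between consonants
        | y\b)                         # word-final y
    | (?P<conso>[{CONS}]
        | (?<=[aeiou])y(?=[aeiou])     # y between vowels
        | \by)                         # word-initial y
    ''', re.IGNORECASE | re.VERBOSE)


def classify(text):
    """Return (vowels, consonants) as lists of single characters."""
    vowels, consonants = [], []
    for m in pattern.finditer(text):
        # lastgroup is the name of the named group that matched.
        (vowels if m.lastgroup == 'vowel' else consonants).append(m.group())
    return vowels, consonants


words = ('Sky Family Yurt Germany Crypt Day New York Pennsylvania Myth Hungry '
         'Yolk Year Bayou Yak Silly Beyond Dynamite Mystery Yacht Yoda')
vowels, consonants = classify(words)
vowel_ys = sum(1 for c in vowels if c in 'yY')
conso_ys = sum(1 for c in consonants if c in 'yY')
print(vowel_ys, conso_ys)  # 12 9
```

A 'y' that fits none of the four rules (for example the one in "crying", which has a consonant before it and a vowel after) is left unclassified, as in the original patterns.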