
Python regex an optional character and a group of characters

I'm trying to write a Python regex that works as follows: a vowel (a, e, i, o, u, á, é, í, ó, ú), optionally followed by a colon (:), followed by another vowel from the same set, which must be followed by another colon. The colon between the vowels is optional, but if it is there it has to be present in the output; that's why I tried to use (:)? below. When the pattern matches, the last colon is dropped. Vowels with acute accents are considered entirely different vowels, so a is a different vowel than á. Below is another representation.

V = a,e,i,o,u,á,é,í,ó,ú
V:V: will become V:V
VV: will become VV

Notice in both patterns the colon after the second vowel is always dropped. But if the colon is present between both vowels it will be present in the output.

Below are a few inputs that should match, and what each should become.

a:é: will become a:é // colon between the vowels is present in the output, colon after the two vowels is dropped from output
ia: will become ia // colon after the two vowels is dropped from output
ó:a: will become óa // colon between the vowels is present in the output, colon after the two vowels is dropped from output

Below is what I've been trying but it's not working:

word = re.sub(ur"([a|e|i|o|u|á|é|í|ó|ú])(:)?([a|e|i|o|u|á|é|í|ó|ú]):", ur'\1\3', word) 

Your example patterns are not consistent with your description. Here are some example patterns and a regex that matches your description.

Code:

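import re

# Every vowel is listed explicitly; accented vowels count as distinct characters.
V = u'aeiouáéíóú'
# Group 1: first vowel, group 2: optional colon (may be empty), group 3: second vowel.
# The trailing colon is outside every group, so the substitution drops it.
RE = re.compile('([%s])(:?)([%s]):' % (V, V))

word = RE.sub(r'\1\2\3', word)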
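For comparison, here is a small Python 3 sketch of two issues visible in the question's attempt: the replacement `\1\3` leaves out group 2, so the middle colon can never survive, and `|` inside `[...]` is a literal pipe character rather than alternation. The names `VOWELS`, `attempt` and `fixed` are illustrative only and do not come from the question.

```python
import re

VOWELS = "aeiouáéíóú"  # illustrative name; same vowel set as in the question

# The question's pattern: "|" inside [...] is matched literally.
attempt = re.compile(r"([a|e|i|o|u|á|é|í|ó|ú])(:)?([a|e|i|o|u|á|é|í|ó|ú]):")

# Plain character class, with the colon captured as a possibly empty group.
fixed = re.compile(r"([{0}])(:?)([{0}]):".format(VOWELS))

# The replacement r"\1\3" skips group 2, so the middle colon is lost.
print(attempt.sub(r"\1\3", "a:é:"))   # aé
# Including \2 in the replacement keeps it.
print(fixed.sub(r"\1\2\3", "a:é:"))   # a:é

# The literal pipe is accepted as if it were a vowel by the first pattern only.
print(attempt.sub(r"\1\3", "|a:"))    # |a
print(fixed.sub(r"\1\2\3", "|a:"))    # |a:
```

Writing the optional colon as `(:?)` instead of `(:)?` means group 2 always participates in the match, with an empty string when the colon is absent, so `\2` is safe to use in the replacement.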

Test Code:

data = (
    (u'a:é:', u'a:é'),
    (u'ia:', u'ia'),
    (u'ó:a', u'ó:a'),
)

for w1, w2 in data:
    print(w2, RE.sub(r'\1\2\3', w1))
    assert w2 == RE.sub(r'\1\2\3', w1)

Results:

(u'a:\xe9', u'a:\xe9')
(u'ia', u'ia')
(u'\xf3:a', u'\xf3:a')
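The output above is in Python 2 tuple form, and the third test input has no trailing colon, so it passes through unchanged. A Python 3 sketch of the same check, extended with a few extra inputs, might look like the following; the function name `drop_trailing_colon` and the additional cases are assumptions added for illustration.

```python
import re

VOWELS = "aeiouáéíóú"
PATTERN = re.compile(r"([{0}])(:?)([{0}]):".format(VOWELS))


def drop_trailing_colon(word):
    """Remove the colon after a vowel pair, keeping any colon between the vowels."""
    return PATTERN.sub(r"\1\2\3", word)


cases = [
    ("a:é:", "a:é"),  # middle colon kept, trailing colon dropped
    ("ia:", "ia"),    # no middle colon, trailing colon dropped
    ("ó:a:", "ó:a"),  # same shape as the first case
    ("ó:a", "ó:a"),   # no trailing colon, so nothing matches
    ("ta:", "ta:"),   # only one vowel, so nothing matches
]

for source, expected in cases:
    result = drop_trailing_colon(source)
    print(source, "->", result)
    assert result == expected
```

Note that `re.sub` replaces non-overlapping matches from left to right, so a longer run such as `a:e:i:` has only its first vowel pair rewritten.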
