This is not for homework!
Hello,
Just a quick question about regex formatting.
I have a list of different courses.
L = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
I was looking for a format to find all the words that start with I, II, or III. Here is what I did (I used Python, FYI):
for course in L:
    if re.search("(I?II?III?)*", course):
        L.pop()
I learned that ? in regex means optional. So I was thinking of making I, II, and III optional, and using * to include whatever follows. However, it seems like it is not working as I intended. What would be a better working format?
Thanks
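Here is the regex you should use:
^I{1,3}.*$
^ matches the start of the string. I{1,3} matches I repeated 1 to 3 times. .* matches any characters that follow. $ matches the end of the string. So this regex matches every word that starts with I, II, or III. (Because .* also accepts further I characters, a word with four or more leading I would match as well.)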
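As a quick check, the pattern above can be run against the list from the question. This is only a minimal sketch; the names courses and starts_with_i are made up for the illustration.

```python
import re

# Course list copied from the question.
courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

# ^ anchors the pattern at the start, so search() cannot match an I in the middle.
pattern = re.compile(r'^I{1,3}.*$')

# Keep only the entries that begin with one to three I characters.
starts_with_i = [c for c in courses if pattern.search(c)]
print(starts_with_i)  # ['I-', 'III-']
```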
Now look at your regex. First, it has no ^, so it can match anywhere in the string. Second, ? affects only the single character before it, so in I?II?III? the first I is optional, the second is required, the third is optional, the fourth and fifth are required, and the sixth is optional. One pass through the parentheses therefore needs between 3 and 6 consecutive I. Finally, the * after the parentheses lets the group repeat any number of times, including zero times. Zero repetitions match the empty string, so re.search() succeeds on every course, whether or not it contains an I.
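To see this behaviour concretely, the sketch below prints what the original pattern actually matches for each course. The names original and matched_text are made up for the illustration.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r'(I?II?III?)*')

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

# search() never returns None here: zero repetitions of the group
# match the empty string at position 0 of any input.
matched_text = {c: original.search(c).group(0) for c in courses}

for course, text in matched_text.items():
    print(repr(course), '->', repr(text))
```

Only 'III-' yields a non-empty match ('III'); every other course, including 'I-', matches the empty string, which is why the if condition in the question is always true.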
Instead of search() you can use the function match(), which matches the pattern only at the beginning of the string:
import re
l = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
pattern = re.compile(r'I{1,3}')
[i for i in l if not pattern.match(i)]
# ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
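The difference between the two functions can be seen on a single string. This is a small sketch using the same unanchored pattern as above:

```python
import re

pattern = re.compile(r'I{1,3}')

# search() scans the whole string, so the I inside 'CI101' is found.
print(bool(pattern.search('CI101')))  # True

# match() only tries at position 0, where 'CI101' has a C.
print(bool(pattern.match('CI101')))   # False

# A string that really starts with I matches either way.
print(bool(pattern.match('III-')))    # True
```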