Regex format to find a specific string from the list
This is not for homework!
Hello,
Just a quick question about regex formatting.
I have a list of different courses.
L = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
I was looking for a pattern that finds all the words that start with I, II, or III. Here is what I did (I used Python, FYI):
for course in L:
    if re.search("(I?II?III?)*", course):
        L.pop()
I learned that ? in regex means optional. So I was thinking of making I, II, and III optional, with * to include whatever follows. However, it does not seem to work as I intended. What would be a better pattern?
Thanks
Here is the regex you should use:
^I{1,3}.*$
^ means the start of a line (here, the start of the string).
I{1,3} means repeat I 1 to 3 times.
.* means any characters that follow, possibly none.
$ means the end of a line.
So this regex will match all the words that start with I, II, or III.
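As a quick check, the pattern can be run against the list from the question. This is only a minimal sketch; the names courses and roman are illustrative and not part of the original answer. Note that because .* accepts anything after the first I, the pattern also accepts any string that merely starts with I (a hypothetical 'INFO101', for example), so it relies on the course list not containing such codes.

```python
import re

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
roman = re.compile(r'^I{1,3}.*$')

# Keep only the entries the pattern accepts.
matches = [c for c in courses if roman.search(c)]
print(matches)  # ['I-', 'III-']

# Caveat: a hypothetical code such as 'INFO101' also starts with I, so it matches too.
print(bool(roman.search('INFO101')))  # True
```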
Now look at your regex. First, it has no ^ anchor, so it can match I anywhere in the string. Second, ? only affects the single character before it, so the first I is optional but the second is not; the third I is optional, the fourth and fifth are not, and the sixth is optional. Finally, you wrapped the expression in parentheses followed by *, which means the group can repeat any number of times, including zero. So it matches either zero I's or at least three I's, and because zero repetitions match the empty string, re.search() succeeds on every course.
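To see this behaviour concretely, here is a small sketch (the variable names are illustrative) that prints what the original pattern actually matches. Since the group may repeat zero times, re.search() finds an empty match at the start of every string, so the if condition in the question is always true.

```python
import re

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
original = re.compile(r'(I?II?III?)*')

for course in courses:
    m = original.search(course)
    # m is never None; group(0) is '' unless the string starts with three or more I's
    print(course, repr(m.group(0)))
```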
Instead of search() you can use the function match(), which matches the pattern only at the beginning of the string:
import re
l = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
pattern = re.compile(r'I{1,3}')
[i for i in l if not pattern.match(i)]
# ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
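Building a new list also avoids a separate problem in the original loop: L.pop() with no argument removes the last element of the list, not the course being tested, and the list is modified while it is being iterated. A hedged sketch of the same filtering wrapped in a helper follows; the function name drop_roman_prefixed and the constant ROMAN_PREFIX are hypothetical and do not come from the answer.

```python
import re

ROMAN_PREFIX = re.compile(r'I{1,3}')

def drop_roman_prefixed(courses):
    """Return a new list without entries that start with I, II, or III."""
    # match() anchors at the start of the string, so no ^ is needed
    return [c for c in courses if not ROMAN_PREFIX.match(c)]

l = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']
print(drop_roman_prefixed(l))  # ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
print(len(l))  # 7, the original list is left unchanged
```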