
Regex format to find a specific string from the list

This is not for homework!

Hello,

Just a quick question about regex formatting.

I have a list of different courses.

L = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

I was looking for a pattern to find all the words that start with I, II, or III. Here is what I did (I used Python, FYI).

for course in L:
    if re.search("(I?II?III?)*", course):
        L.pop()

I learned that ? in regex means optional. So I was thinking of making I, II, and III optional and using * to include whatever follows. However, it does not seem to work as I intended. What would be a better pattern?

Thanks

Here is the regex you should use:

^I{1,3}.*$


^ anchors the match at the start of a line. I{1,3} matches I repeated 1 to 3 times. .* matches any characters that follow. $ anchors the match at the end of a line. So this regex will match all the words that start with I, II, or III.
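As a quick check of this pattern against the list from the question, here is a minimal sketch. The names `courses`, `starts_with_i`, `matched`, and `kept` are illustrative and not from the original post.

```python
import re

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

# Anchored pattern from the answer: one to three leading I's, then anything.
starts_with_i = re.compile(r'^I{1,3}.*$')

# Split the list into entries that match and entries that do not.
matched = [c for c in courses if starts_with_i.search(c)]
kept = [c for c in courses if not starts_with_i.search(c)]

print(matched)  # ['I-', 'III-']
print(kept)     # ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
```

Building a new list this way avoids removing items from the list while looping over it.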

Looking at your regex: first, it has no ^ anchor, so it can match I anywhere in the string. Second, ? only affects the single character before it, so the first I is optional, the second is not, the third is optional, the fourth and fifth are not, and the sixth is optional. Finally, the * after the parentheses means the expression in parentheses can repeat any number of times, including zero. So the pattern matches either zero I's or at least three I's, and because a match of zero characters always succeeds, re.search() finds a match in every string.
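To see this behaviour directly, the following sketch runs the pattern from the question over the same list and records what each call actually matched. The names `courses`, `original`, and `results` are illustrative.

```python
import re

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

# Pattern from the question, unchanged.
original = re.compile(r'(I?II?III?)*')

# The group needs at least three I's, but * allows zero repetitions,
# so search() succeeds on every string, usually with an empty match.
results = {c: original.search(c).group(0) for c in courses}

for course, text in results.items():
    print(repr(course), '->', repr(text))
# Only 'III-' yields a non-empty match ('III'); every other entry matches ''.
```

Since every call returns a match object, the `if` in the original loop is always true.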


Instead of search() you can use the function match(), which matches the pattern only at the beginning of the string:

import re

l = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

pattern = re.compile(r'I{1,3}')

[i for i in l if not pattern.match(i)]
# ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
