Regex format to find specific strings from a list

Question

This is not for homework!

Hello,

Just a quick question about regex formatting.

I have a list of different courses.

L = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

I am looking for a pattern that finds all the words that start with I, II, or III. Here is what I did (I used Python, FYI):

for course in L:
    if re.search("(I?II?III?)*", course):
        L.pop()

I learned that ? in regex means optional, so I was thinking of making I, II, and III optional and using * to include whatever follows. However, it does not seem to work as I intended. What would be a better pattern?

Thanks

Answer 1

Here is the regex you should use:

^I{1,3}.*$


^ matches the start of a line. I{1,3} matches I repeated 1 to 3 times. .* matches any remaining characters. $ matches the end of a line. So this regex matches all the words that start with I, II, or III.
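A quick way to check this pattern against the list from the question is sketched below. The names `courses`, `starts_with_i`, and `matched` are chosen here for illustration and do not come from the answer.

```python
import re

courses = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

# ^ anchors at the start, I{1,3} needs one to three I's, .*$ consumes the rest.
starts_with_i = re.compile(r'^I{1,3}.*$')

# Keep only the courses the anchored pattern accepts.
matched = [c for c in courses if starts_with_i.search(c)]
print(matched)  # ['I-', 'III-']
```

Because .* accepts anything after the leading I's, a string such as 'IIII-' would also match. That does not matter for this list, but it is worth knowing if four or more I's should be rejected.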

Now look at your regex. First, it has no ^ anchor, so it can match I anywhere in the string. Second, ? only affects the single character before it, so the first I is optional, the second is required, the third is optional, the fourth and fifth are required, and the sixth is optional. The group therefore needs between 3 and 6 I's. Finally, the * after the parentheses lets the whole group repeat any number of times, including zero. So the pattern matches either zero I's or at least three, and because zero repetitions are allowed, re.search() succeeds on every string.
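The behaviour described above can be seen by printing what the original pattern actually matches. This is a small sketch, and the names `original` and `results` are made up for the demonstration.

```python
import re

original = re.compile(r"(I?II?III?)*")

# The group may repeat zero times, so search() finds an empty match
# at position 0 of any string, even one that contains no I at all.
results = {}
for course in ['CS164', 'I-', 'III-']:
    m = original.search(course)
    results[course] = m.group(0)
    print(course, repr(m.group(0)))
```

The match object is never None here, so the `if` in the question's loop is true for every course. Only 'III-' produces a non-empty match, because the group needs at least three I's.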


Answer 2

Instead of search() you can use the function match(), which only matches the pattern at the beginning of the string:

import re

l = ['CI101', 'CS164', 'ENGL101', 'I-', 'III-', 'MATH116', 'PSY101']

pattern = re.compile(r'I{1,3}')

[i for i in l if not pattern.match(i)]
# ['CI101', 'CS164', 'ENGL101', 'MATH116', 'PSY101']
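To see why match() is enough without a ^ anchor, the two functions can be compared on a course that contains an I but does not start with one. This is only an illustrative sketch, and the variable names `found_anywhere` and `found_at_start` are assumptions.

```python
import re

pattern = re.compile(r'I{1,3}')

# search() scans the whole string, so the I inside 'CI101' counts as a hit.
found_anywhere = pattern.search('CI101') is not None

# match() only tries position 0, where 'CI101' has a C, so it returns None.
found_at_start = pattern.match('CI101') is not None

print(found_anywhere, found_at_start)  # True False
```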
