Python re pattern matching

Question

I am trying to solve a regex matching problem using the re module. I would like to copy the lines beginning with * from a file. The exact line pattern is:

*7  3   279 0

where the fields are separated by tabs. My regex to match these lines is:

regex = re.compile(r'^\*\d+.\n', re.MULTILINE)
for line in f:
    if regex.match(line):
        print >> a, line

The script I wrote creates the file 'a', but it is empty: the pattern is not recognised. Do you have any advice?

Moreover, could you explain the difference between a pattern in double quotes and one in single quotes? I searched several Python manuals but did not find any information.

Answer 1

Your regex does not cover the whole line. It only matches lines made of *, digits and exactly one more character before the newline, roughly of the type:

*7

Something like r'^\*(?:\d+\s+)+$' should work (the newline at the end of each line satisfies the final \s+). There is no need for re.MULTILINE, since you're applying the regex to each line of the file separately.

Edit: Changed to a non-capturing group, since the capture isn't needed.
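As a quick check of the point above, here is a minimal sketch that runs both patterns against the sample line from the question. The variable names are only illustrative, and the sample string assumes the fields are separated by single tabs. It also touches on the quoting question: as far as Python is concerned, single and double quotes produce the same string, and it is the r prefix that matters for backslashes.

```python
import re

# Sample line from the question: tab-separated fields, newline at the end.
line = "*7\t3\t279\t0\n"

# The question's pattern: '*', digits, exactly one character, then a newline.
original = re.compile(r'^\*\d+.\n', re.MULTILINE)

# The pattern suggested above: '*', then one or more
# "digits followed by whitespace" groups, up to the end of the string.
fixed = re.compile(r'^\*(?:\d+\s+)+$')

original_hit = original.match(line) is not None   # fails: '3' follows the tab
fixed_hit = fixed.match(line) is not None         # matches the whole line
short_hit = original.match("*7 \n") is not None   # the only kind of line it accepts

# Quote style makes no difference to the pattern text.
same_pattern = r'^\*\d+' == r"^\*\d+"

print(original_hit, fixed_hit, short_hit, same_pattern)
```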

Answer 2

Assuming you are ONLY looking for * followed by a number at the beginning of a line, you only need to do this:

import re

regex = re.compile(r'\*\d+')
for line in f:
    if regex.match(line):
        # Python 2 syntax; in Python 3 use print(line, end='', file=a)
        print >> a, line
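For readers on Python 3, the same loop could be sketched as below. The function name copy_star_lines is hypothetical, and in-memory io.StringIO objects stand in for the real input file f and output file a so that the example runs on its own.

```python
import io
import re

STAR_NUMBER = re.compile(r'\*\d+')

def copy_star_lines(src, dst):
    """Write every line of src starting with '*' + digits to dst; return the count."""
    copied = 0
    for line in src:
        if STAR_NUMBER.match(line):
            dst.write(line)  # the line already ends with its own newline
            copied += 1
    return copied

# In-memory stand-ins for the real files (assumed contents).
src = io.StringIO("*7\t3\t279\t0\nheader line\n*12\t1\t5\t9\n")
dst = io.StringIO()
count = copy_star_lines(src, dst)

print(count)
print(dst.getvalue(), end="")
```

With real files, src and dst would be the objects returned by open(), ideally inside a with block.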

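If you care about how many whitespace-delimited numbers are found (four, in this case):

import re

regex = re.compile(r'\*(?:\d+\s+){3}\d+')
for line in f:
    if regex.match(line):
        # Python 2 syntax; in Python 3 use print(line, end='', file=a)
        print >> a, line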
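A small sketch of how the field-count pattern behaves, using made-up sample lines: it accepts a line with four numbers and rejects one with only two.

```python
import re

four_fields = re.compile(r'\*(?:\d+\s+){3}\d+')

full_line = "*7\t3\t279\t0\n"   # four numbers, as in the question
short_line = "*7\t3\n"          # only two numbers

full_ok = four_fields.match(full_line) is not None
short_ok = four_fields.match(short_line) is not None

print(full_ok, short_ok)
```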

If you use re.match you don't need the ^ anchor, because match only tries the pattern at the start of the string. If you use re.search, you do need it. See the docs.
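The difference can be seen with a line where the * is not in first position. This is only an illustrative sketch; the second sample line is invented.

```python
import re

unanchored = re.compile(r'\*\d+')    # no ^ anchor
anchored = re.compile(r'^\*\d+')     # with ^ anchor

star_first = "*7\t3\t279\t0\n"
star_later = "id\t*7\t3\n"

# match() only tries the pattern at the beginning of the string.
match_first = unanchored.match(star_first) is not None
match_later = unanchored.match(star_later) is not None

# search() scans the whole string unless the pattern is anchored.
search_later = unanchored.search(star_later) is not None
anchored_search_later = anchored.search(star_later) is not None

print(match_first, match_later, search_later, anchored_search_later)
```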

Answer 3

Try this:

 re.compile(r'^\*\d\s+\d+\s+')
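One thing to watch with this pattern is that the single \d after the star accepts only a one-digit first number. The sketch below compares it with a hypothetical variant using \d+; the two-digit sample line is made up.

```python
import re

answer3 = re.compile(r'^\*\d\s+\d+\s+')     # single \d after the star
relaxed = re.compile(r'^\*\d+\s+\d+\s+')    # hypothetical variant with \d+

one_digit = "*7\t3\t279\t0\n"
two_digits = "*12\t3\t279\t0\n"

one_digit_ok = answer3.match(one_digit) is not None
two_digits_ok = answer3.match(two_digits) is not None
relaxed_ok = relaxed.match(two_digits) is not None

print(one_digit_ok, two_digits_ok, relaxed_ok)
```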

Answer 4

I don't know Python, but the regular expression should probably be ^[*][\\d(\\s)*]+$
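Written as a Python raw string (single backslashes), that pattern does match the sample line, but inside a character class the parentheses and the star are literal characters, so it also accepts some unintended lines. The sketch below compares it with a simplified class, which is an assumption about what was intended.

```python
import re

answer4 = re.compile(r'^[*][\d(\s)*]+$')   # class also contains '(', ')' and '*'
simpler = re.compile(r'^\*[\d\s]+$')       # assumed intent: digits and whitespace only

sample = "*7\t3\t279\t0\n"
odd = "*(*)\n"

sample_ok = answer4.match(sample) is not None
odd_ok = answer4.match(odd) is not None
simpler_sample_ok = simpler.match(sample) is not None
simpler_odd_ok = simpler.match(odd) is not None

print(sample_ok, odd_ok, simpler_sample_ok, simpler_odd_ok)
```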

solution1
2 2013-02-14 17:00:32

solution2
1 ACCEPTED

solution3
0 2013-02-14 17:00:34

solution4
0 2013-02-14 17:01:50
