Matching a space at the beginning of a line using pyparsing

Question

I'm trying to parse a unified diff file using pyparsing as an exercise, and there is one thing I can't get right. Here is the part of my diff file that's causing me trouble:

(... some stuff over...)
 banana
+apple
 orange

The first line starts with " " followed by "banana". I have the following expression for parsing a line:

linestart = Literal(" ") | Literal("+") | Literal("-")
line = linestart.leaveWhitespace() + restOfLine

This works when parsing a single line, but when I try to parse the whole file, the leaveWhitespace call makes the parser resume at the end of the previous line instead of skipping over the newline. In my example, after parsing " banana", the next character is "\n" (because of leaveWhitespace), so the parser tries to match " ", "+" or "-" against it and throws an error.

How can I handle this correctly?

Answer 1

You can read and parse one line at a time. The following code works for me.

from pyparsing import Literal, restOfLine

linestart = Literal(" ") | Literal("+") | Literal("-")
line = linestart.leaveWhitespace() + restOfLine

# Parse the diff one line at a time; restOfLine stops before the newline.
with open("/tmp/test.diff") as f:
    for l in f:
        fields = line.parseString(l)
        print(fields)

The output is:

[' ', 'banana']
['+', 'apple']
[' ', 'orange']

Or, if you have to parse several lines at once, you can explicitly match the LineEnd after each line:

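from pyparsing import Literal, LineEnd, ZeroOrMore, restOfLine

linestart = Literal(" ") | Literal("+") | Literal("-")
# LineEnd() consumes the newline, so the next match starts at the following line's marker.
line = linestart.leaveWhitespace() + restOfLine + LineEnd()
lines = ZeroOrMore(line)
with open("/tmp/test.diff") as f:
    lines.parseString(f.read())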
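
As a self-contained variant of the multi-line approach, here is a minimal sketch that parses the three sample lines from the question out of an in-memory string. The names SAMPLE, marker, diff_line and parse_hunk_body are made up for this illustration, and wrapping each line in Group while suppressing the LineEnd token is just one way to shape the result, not something the original expression does.

```python
from pyparsing import Group, LineEnd, Literal, ZeroOrMore, restOfLine

# Sample taken from the question: every line starts with ' ', '+' or '-'.
SAMPLE = " banana\n+apple\n orange\n"

# leaveWhitespace() keeps pyparsing from skipping the leading space,
# so the ' ' context marker is matched as a token in its own right.
marker = (Literal(" ") | Literal("+") | Literal("-")).leaveWhitespace()

# LineEnd() consumes the newline so the next line starts at its marker;
# suppress() drops the '\n' token and Group() keeps each line's fields together.
diff_line = Group(marker + restOfLine + LineEnd().suppress())
diff_lines = ZeroOrMore(diff_line)


def parse_hunk_body(text):
    """Return one [marker, text] pair per diff line (hypothetical helper)."""
    return diff_lines.parseString(text, parseAll=True).asList()


parsed = parse_hunk_body(SAMPLE)
for fields in parsed:
    print(fields)
```

Passing parseAll=True makes pyparsing raise a ParseException if some line does not start with one of the three markers, rather than silently stopping at the last line that matched.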

1 answers

solution1
1 2010-11-20 11:28:34
