Separate text from hyphens line

Question

I am wondering if there is a better approach than what I am currently taking to parse this file. I have a string that is in the general format of:

[Chunk of text]
--------------------
[Another chunk of text]

(There can be multiple chunks of text with the same separator between them)

I am trying to parse the chunks of text into elements of a list, which I can do with data.split('-'*20) [in this case]. However, if there are not exactly 20 hyphens, the split will not work as intended. I have been playing around with regex but am currently unsure of a proper pattern to use.

Are there any better methods that I should use in this situation, or is there a regex I should use as opposed to the .split() method?

Answer 1

I would try re.split() with the regex --+, which breaks down as:

-  : one hyphen
-+ : one or more hyphens

This way it does not match a single hyphen, only runs of two or more. Alternatively, you could use -{2,}, which also means two or more hyphens.
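As a minimal sketch of that idea (the sample string and the variable names are made up for illustration), splitting on runs of two or more hyphens leaves a single hyphen inside the text alone. The newlines around each separator stay attached to the chunks, so this sketch strips them afterwards:

```python
import re

# Hypothetical sample input: separators of different lengths,
# plus an ordinary single hyphen inside the first chunk.
data = "first well-known chunk\n--------------------\nsecond chunk\n---------------\nthird chunk"

# Split on any run of two or more hyphens, then trim the leftover newlines.
chunks = [part.strip() for part in re.split(r"-{2,}", data)]
print(chunks)
# ['first well-known chunk', 'second chunk', 'third chunk']
```

Note that this would also split on a double hyphen that appears inside a chunk, so it only fits input where the text itself never contains "--".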

Answer 2

You want a regex split. I'm not Python-literate, but I found the function in the official 2.7.10 documentation and adapted it to your case:

>>> re.split(r'\n-{4,}\n', data)

4 is the minimum number of dashes you want to match.
The \n on each side matches the newlines before and after the dashes, so the split removes them. You probably don't want those in your text.
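A short sketch of how this newline-anchored pattern behaves, assuming Unix line endings and a made-up sample string. Because the dashes must sit between two newlines, dashes that appear inside a line of text are not treated as separators:

```python
import re

# Hypothetical sample input: a double hyphen inside the first chunk,
# followed by separator lines of 4 and 12 dashes.
data = "alpha -- still alpha\n----\nbeta\n------------\ngamma"

# Split only where a line made of four or more dashes sits between newlines.
chunks = re.split(r"\n-{4,}\n", data)
print(chunks)
# ['alpha -- still alpha', 'beta', 'gamma']
```

A separator on the very first or last line of the string, or input with Windows line endings (\r\n), would not match this pattern as written and would need a slightly different regex.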

2 answers

solution1
1 2015-07-13 15:36:32

solution2
1 ACCEPTED 2015-07-13 16:03:04