I have the following string to be parsed:
Field 1:Value 1
Overriden Field 2: Value 2.1 Value 2.2
Field 3: Value 3
Overriden Field 4:Value 4
Field 5:Value5
Basically, each field is separated from its value by a colon, and each field name (which doesn't always begin with "Field ...") starts on a new line and is followed by a colon. I want to extract the overridden field-value pairs as two (or more) strings: one as "Overriden Field 2:...Value 2.2" and one as "Overriden Field 4:Value 4".
I don't know how many overridden fields there are, but they all start with "Overriden". I'm not sure grouping can help.
The best I can think of is to use re.findall() to search for occurrences of "Overriden[^:]*:[^:]*:?", so I will get two results.
I will then have to chop the trailing "\n[^:]*:" off each result. This doesn't look smart.
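For reference, the two-step idea described above might look like the sketch below. It assumes the intended pattern is `Overriden[^:]*:[^:]*:?` and that the input has one field per line with values allowed to continue onto following lines; the sample `text` and the variable names are illustrative only. The chop pattern here excludes newlines (`[^:\n]*`), since a plain `[^:]*` could also swallow a continuation line of the value.

```python
import re

# Assumed sample input: every field name starts on its own line,
# and a value may continue onto the next line.
text = (
    "Field 1:Value 1\n"
    "Overriden Field 2: Value 2.1\n"
    "Value 2.2\n"
    "Field 3: Value 3\n"
    "Overriden Field 4:Value 4\n"
    "Field 5:Value5"
)

# Step 1: each match runs from "Overriden" through the value and on
# into the next field name, up to and including that field's colon.
raw = re.findall(r"Overriden[^:]*:[^:]*:?", text)

# Step 2: chop the trailing "\n<next field name>:" off each match.
# [^:\n]* keeps the removal confined to the last line of the match.
overridden = [re.sub(r"\n[^:\n]*:$", "", m) for m in raw]

for item in overridden:
    print(repr(item))
```

It works on this sample, but it needs a second pass over every match, which is the awkwardness the question is about.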
Would anyone like to give some advice?
You can perhaps use something like this:
\s*([^:]+)\s*:\s*((?:[^:](?![^:\n]+:))+)\s*
[I added the \s* only to strip leading and trailing spaces and/or newlines; they can be removed without changing the core content that is captured.]
The regex started as:
([^:]+):([^:]+)
Then I changed the second part to ((?:[^:](?![^:\n]+:))+), which makes sure that no character in the value is followed by a : later on the same line (that would mean the match is running into the next field name instead of staying within the value).
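A minimal sketch of how this regex could be applied in Python is shown below. It assumes the input has one field per line with values allowed to continue onto following lines; the sample `text`, the variable names, and the `startswith` filter are illustrative additions, not part of the regex itself.

```python
import re

# Assumed sample input: every field name starts on its own line,
# and a value may continue onto the next line.
text = (
    "Field 1:Value 1\n"
    "Overriden Field 2: Value 2.1\n"
    "Value 2.2\n"
    "Field 3: Value 3\n"
    "Overriden Field 4:Value 4\n"
    "Field 5:Value5"
)

# Group 1 is the field name, group 2 is the value. The negative
# lookahead stops the value before a line that ends up in a colon.
pattern = re.compile(r"\s*([^:]+)\s*:\s*((?:[^:](?![^:\n]+:))+)\s*")

# With two groups, findall returns a list of (field, value) tuples.
pairs = pattern.findall(text)

# Keep only the pairs whose field name starts with "Overriden".
overridden = [
    f"{field}:{value}"
    for field, value in pairs
    if field.startswith("Overriden")
]

for item in overridden:
    print(repr(item))
```

Because the surrounding `\s*` sit outside the groups, whitespace right after the colon is not part of the captured value, so the rebuilt strings have no space after the colon.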