I'm having to parse a text dump of a spreadsheet. I have a regular expression that correctly parses each line of the data, but it's rather long. It's basically just matching a certain pattern 12 or 13 times.
The pattern I want to repeat is
\s+(\w*\.*\w*);
This is the regular expression (shortened)
^\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);\s+(\w*\.*\w*);
Is there a way to match a pattern a set number of times without copy-pasting like this? Each of those sections corresponds to a data column, and I need all of them. I'm using Python, by the way. Thanks!
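(\s+(\w*\.*\w*);){12}
The {n} quantifier means "repeat n times". In Python, write the pattern as a raw string (r'...') so the backslashes do not need to be doubled.
If you want 12 to 13 times:
(\s+(\w*\.*\w*);){12,13}
If you want 12 or more times:
(\s+(\w*\.*\w*);){12,}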
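One caveat with the quantified form: when a capturing group is repeated, Python's match object keeps only the last repetition, so the earlier columns are lost. A possible workaround, sketched below, is to use the quantified pattern only to check the shape of the line and then collect the columns with re.findall. The names CELL, ROW and parse_row and the sample line are made up for illustration.

```python
import re

# One column: whitespace, the captured value, then a semicolon.
CELL = r'\s+(\w*\.*\w*);'
# A whole row: the column pattern repeated 12 or 13 times.
ROW = re.compile(r'^(?:' + CELL + r'){12,13}$')

def parse_row(line):
    """Return all column values, or None if the line does not have 12-13 columns."""
    m = ROW.match(line)
    if m is None:
        return None
    # m.group(1) would hold only the LAST column, so collect them all with findall.
    return re.findall(CELL, line)

# Hypothetical sample line with 12 columns: " c1; c2; ... c12;"
line = ''.join(' c%d;' % i for i in range(1, 13))
print(ROW.match(line).group(1))  # c12
print(parse_row(line))
```

With this split, the {12,13} quantifier does the validation and findall does the extraction, so the column pattern is written only once.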
How about using:
[m.group(1) for m in re.finditer(r'\s+(\w*\.*\w*);', text)]
Did you find the findall method yet? Or consider splitting at ; instead.
[x.strip() for x in s.split(";")]
is probably what you really want.
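As a rough sketch of the splitting approach, assuming every line ends with a trailing ; as in the question's pattern (the function name split_row and the sample line are hypothetical):

```python
def split_row(line):
    """Split a line on ';' and strip whitespace from each column."""
    # The trailing ';' leaves one empty piece at the end, so drop it.
    parts = line.split(';')[:-1]
    return [part.strip() for part in parts]

sample = '  1.5;  abc;  ;  x.y;'
print(split_row(sample))  # ['1.5', 'abc', '', 'x.y']
```

Unlike the regular expression, this does not check what each column contains, so it only fits if the dump is known to be well formed.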