Given a string containing four values:
1) Vehicle model <- any number of alpha-numeric words
2) Engine description <- one word before the next value:
3) Power output <- \d+KW
4) Optional keywords <- any number of alpha-numeric words
For example:
1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE
MINI CLUBMAN 1.6T 128KW COOPER S
TWINGO 1.2 55KW
How can I extract these into Python variables using re?
I think the simplest approach is to first find the power output (an anchor point), then match the word before it to get the engine description, and then match everything before that to get the model. Everything after the power output would be the optional keywords.
I feel I need to do something with a lookbehind, (?<=...), but I can't get it to work.
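The anchor-point idea described above can be sketched without any lookbehind. Python's re module only accepts fixed-width lookbehind patterns, which is probably why (?<=...) is awkward for a model name of unknown length. The function name split_vehicle is made up for illustration, and the sketch assumes the power output is the first token of the form digits followed by KW.

```python
import re

def split_vehicle(text):
    """Split a vehicle description around its power output (e.g. '105KW').

    Hypothetical helper: returns (model, engine, power, keywords),
    or None if the text does not have the expected shape.
    """
    # Step 1: locate the anchor, the first whole token like '105KW'.
    anchor = re.search(r"\b\d+KW\b", text)
    if anchor is None:
        return None

    # Step 2: the last word before the anchor is the engine description,
    # and everything before that word is the model.
    before = text[:anchor.start()].split()
    if len(before) < 2:
        return None
    model = " ".join(before[:-1])
    engine = before[-1]

    # Step 3: everything after the anchor is the optional keywords.
    keywords = text[anchor.end():].strip()
    return model, engine, anchor.group(), keywords


for line in (
    "1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE",
    "MINI CLUBMAN 1.6T 128KW COOPER S",
    "TWINGO 1.2 55KW",
):
    print(split_vehicle(line))
```

Here the keywords come back as an empty string when there are none, which keeps the return type uniform.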
Slightly modified from Matt G. (added named groups and matches all optional keywords):
^(?P<model>([\S\s]+?))(?= \S+(?= \d+KW)) (?P<engine>(\S+))(?=(?= \d+KW)) (?P<kw>(\d+))KW(?P<keywords>(?<=KW)\s?(.*))
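As a rough illustration, the named-group pattern above can be used from Python like this. The names PATTERN and parse are made up for the example. Note that the keywords group includes the space that follows KW, so the sketch strips it.

```python
import re

# The named-group pattern from above, split across two raw strings
# only to keep the line short; the regex itself is unchanged.
PATTERN = re.compile(
    r"^(?P<model>([\S\s]+?))(?= \S+(?= \d+KW)) (?P<engine>(\S+))(?=(?= \d+KW)) "
    r"(?P<kw>(\d+))KW(?P<keywords>(?<=KW)\s?(.*))"
)

def parse(text):
    """Return a dict with model, engine, kw and keywords, or None on no match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    fields = match.groupdict()  # only the named groups, not the nested ones
    fields["keywords"] = fields["keywords"].strip()
    return fields


for line in (
    "1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE",
    "MINI CLUBMAN 1.6T 128KW COOPER S",
    "TWINGO 1.2 55KW",
):
    print(parse(line))
```

With this pattern, kw holds only the digits (for example "105"), because the literal KW sits outside the named group.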
Try this regex: ^([\S\s]+?)(?= \S+(?= \d+KW)) (\S+)(?=(?= \d+KW)) (\d+)KW(?: ([^\s]+))*
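One thing to be aware of with this version: in Python's re, a capturing group inside a repetition keeps only its last repetition, so the final group yields just the last keyword. A small check, with the pattern written as a Python raw string using single backslashes (the name POSITIONAL is made up for the example):

```python
import re

# Same pattern as above, as a raw string so each backslash is written once.
POSITIONAL = re.compile(
    r"^([\S\s]+?)(?= \S+(?= \d+KW)) (\S+)(?=(?= \d+KW)) (\d+)KW(?: ([^\s]+))*"
)

m = POSITIONAL.match("1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE")
model, engine, kw, last_keyword = m.groups()

# The repeated group matched ' EFF.DYN.', ' BUSINESS' and ' LINE' in turn,
# but only the final repetition is retained.
print(model, engine, kw, last_keyword)
```

This is presumably why the modified version captures the whole remainder with a single (.*) instead.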