Python regex extract previous/next group from anchor point in string

Given a string containing four values:

1) Vehicle model        <- any number of alpha-numeric words
2) Engine description   <- one word, directly before the power output
3) Power output         <- \d+KW
4) Optional keywords    <- any number of alpha-numeric words

For example:
1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE
MINI CLUBMAN 1.6T 128KW COOPER S
TWINGO 1.2 55KW

How can I extract these into Python variables using re?

I think the simplest approach is to first find the power output (an anchor point), then match the previous word to find the engine description, and then match everything before that to retrieve the model. Everything after the power output should also be matched to find the optional keywords.
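The anchor idea described above can be sketched without any lookbehind: search for the power output once, then split the text on either side of the match. This is only a minimal sketch; the names POWER and split_on_power are hypothetical and not from the original post, and it assumes the first \d+KW token in the string is the power output.

```python
import re

# Anchor: the power output as a whole token, e.g. "105KW".
POWER = re.compile(r"\b(\d+)KW\b")


def split_on_power(text):
    """Hypothetical helper: split a vehicle string around its power output."""
    anchor = POWER.search(text)
    if anchor is None:
        return None
    # Words before the anchor: the model words, then exactly one engine word.
    before = text[:anchor.start()].split()
    # Words after the anchor: the optional keywords (possibly none).
    after = text[anchor.end():].split()
    if not before:
        return None
    *model_words, engine = before
    return {
        "model": " ".join(model_words),
        "engine": engine,
        "kw": int(anchor.group(1)),
        "keywords": " ".join(after),
    }


for line in (
    "1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE",
    "MINI CLUBMAN 1.6T 128KW COOPER S",
    "TWINGO 1.2 55KW",
):
    print(split_on_power(line))
```

Splitting on whitespace keeps the regex trivial, at the cost of normalising repeated spaces inside the model and keywords.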

I feel I need to do something with (?<= ..) but I can't get it to work.

Slightly modified from Matt G. (added named groups and matches all optional keywords):

^(?P<model>([\S\s]+?))(?= \S+(?= \d+KW)) (?P<engine>(\S+))(?=(?= \d+KW)) (?P<kw>(\d+))KW(?P<keywords>(?<=KW)\s?(.*))
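For reference, here is one way the named-group pattern above might be used from Python. The pattern is copied unchanged; the wrapper function parse is a hypothetical name added for illustration. Because the keywords group includes the optional whitespace after KW, the sketch strips it.

```python
import re

# Pattern copied verbatim from the answer; only the Python around it is new.
# The string is split across two literals, spaces inside it are significant.
PATTERN = re.compile(
    r"^(?P<model>([\S\s]+?))(?= \S+(?= \d+KW)) (?P<engine>(\S+))(?=(?= \d+KW)) "
    r"(?P<kw>(\d+))KW(?P<keywords>(?<=KW)\s?(.*))"
)


def parse(text):
    """Hypothetical wrapper: return the four named fields as a dict."""
    m = PATTERN.match(text)
    if m is None:
        return None
    fields = m.groupdict()  # only the named groups: model, engine, kw, keywords
    # The keywords group captures the separating space as well, so trim it.
    fields["keywords"] = fields["keywords"].strip()
    return fields


for line in (
    "1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE",
    "MINI CLUBMAN 1.6T 128KW COOPER S",
    "TWINGO 1.2 55KW",
):
    print(parse(line))
```

Note that kw comes back as a string of digits; convert it with int() if a number is needed.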

Try Regex: ^([\S\s]+?)(?= \S+(?= \d+KW)) (\S+)(?=(?= \d+KW)) (\d+)KW(?: ([^\s]+))*
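A small sketch of how this pattern behaves in Python's re module, written with single backslashes in a raw string. One thing to be aware of: a capturing group inside a repeated (?: ...)* keeps only its last match, so group 4 holds just the final keyword. The helper name parse_positional is hypothetical, and slicing the input after the KW token is one possible way to recover every keyword.

```python
import re

# The pattern from the answer, as a Python raw string.
ORIGINAL = re.compile(
    r"^([\S\s]+?)(?= \S+(?= \d+KW)) (\S+)(?=(?= \d+KW)) (\d+)KW(?: ([^\s]+))*"
)


def parse_positional(text):
    """Hypothetical helper: positional groups plus the full keyword list."""
    m = ORIGINAL.match(text)
    if m is None:
        return None
    # Group 4 sits inside a repeated group, so it is only the LAST keyword
    # (or None when there are no keywords at all).
    model, engine, kw, last_keyword = m.groups()
    # Recover every keyword by slicing the input just past the "KW" suffix.
    all_keywords = text[m.end(3) + len("KW"):].split()
    return model, engine, kw, last_keyword, all_keywords


print(parse_positional("1-SERIE 118I 105KW EFF.DYN. BUSINESS LINE"))
print(parse_positional("TWINGO 1.2 55KW"))
```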

