Parse a substring in Python using regular expression

I am trying to parse a substring using re.

The variable s holds a string that contains two ! characters. I would like to take everything up to the first ! and store it as a substring (in the variable result). From that substring, I then want to parse another substring.

Here is the code:

import re
s='ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!'


Data={}

result = re.search('%s(.*)%s' % ('ec', '!'), s).group(1)
print(result)
ecNumber = re.search('%s(.*)%s' % ('Number*', '#kmValue*'), result).group(1)
Data["ecNumber"]=ecNumber
print(Data)

The value for each tag in the substring (for example, ecNumber) is stored between * and # (for example, *2.4.1.11#). I attempted to parse the value of the ecNumber tag in the first substring. The output I obtain is:

result='Number*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#'
{'ecNumber': '*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081'}

The desired output is:

result= 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
{'ecNumber': '2.4.1.11'}

I would like to store each tag and its corresponding value. For example:

{'ecNumber': '2.4.1.11','kmValue':'0.57','kmValueMaximum':'1.25'}

You can try this:

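import re
s = 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
# Match a tag at the start of the string or after '#', or a numeric value between '*' and '#'
new_data = re.findall(r'(?<=^)[a-zA-Z]+(?=\*)|(?<=#)[a-zA-Z]+(?=\*)|(?<=\*)[-\d\.]+(?=#)', s)
# Pair up consecutive items as (tag, value)
final_data = dict([new_data[i:i+2] for i in range(0, len(new_data)-1, 2)])
print(final_data)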

Output:

{'kmValue': '0.57', 'kmValueMaximum': '1.25', 'ecNumber': '2.4.1.11'}
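
As a side note, the pattern in the question fails for two reasons: `.*` is greedy, so it runs to the last ! instead of the first, and the * in 'Number*' is treated as a regex quantifier rather than a literal asterisk. Here is a minimal sketch of the same idea with those two points addressed. The names first_record and pairs are just illustrative, and it assumes tags are purely alphabetic and values never contain #.

```python
import re

s = 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!'

# Non-greedy (.*?) stops at the first '!' instead of the last one.
first_record = re.search(r'(.*?)!', s).group(1)

# Each field looks like tag*value#; \* matches a literal asterisk,
# and [^#]* lets the value be empty.
pairs = re.findall(r'([A-Za-z]+)\*([^#]*)#', first_record)
data = dict(pairs)

print(first_record)
print(data)
```

Because findall returns (tag, value) tuples when the pattern has two groups, the result can be passed straight to dict without pairing up items by index.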

Although you asked for a solution using regular expressions, I would say it is much easier to use plain string operations for this problem, since the source string is well formatted.

For the information before the first !:

print(dict([i.split('*') for i in s.split('!', 1)[0].split('#') if i]))

For all information in s:

print([dict([i.split('*') for i in j.split('#') if i]) for j in s.split('!') if j])
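
If the one-liners are hard to read, the same split-based idea can be spelled out as a small function. This is only a sketch: parse_records is a hypothetical name, and it assumes that ! separates records, # separates fields, and the first * in a field separates the tag from its value.

```python
def parse_records(text):
    """Split text on '!' into records, and each record on '#' into tag*value fields."""
    records = []
    for record in text.split('!'):
        if not record:
            continue  # skip the empty piece after the trailing '!'
        fields = {}
        for field in record.split('#'):
            if not field:
                continue  # skip the empty piece after the trailing '#'
            # partition splits on the first '*' only; a missing value becomes ''
            tag, _, value = field.partition('*')
            fields[tag] = value
        records.append(fields)
    return records


s = 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!'
records = parse_records(s)
print(records[0])
print(records[1])
```

Note that the second record has an empty kmValueMaximum, which ends up stored as an empty string rather than being dropped.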
