Parsing a substring in Python using regular expressions

Question

I am trying to parse a substring using re.

The string stored in the variable s contains two ! characters. I would like to take everything up to the first ! and store it as a substring in the variable result. From this substring, I then want to parse another substring.

Here is the code:

import re
s='ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!'


Data={}

result = re.search('%s(.*)%s' % ('ec', '!'), s).group(1)
print(result)
ecNumber = re.search('%s(.*)%s' % ('Number*', '#kmValue*'), result).group(1)
Data["ecNumber"]=ecNumber
print(Data)

The value for each tag in the substring (for example, ecNumber) is stored between * and # (for example, *2.4.1.11#). I attempted to parse the value stored for the tag ecNumber in the first substring. The output I obtain is

result='Number*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#'
{'ecNumber': '*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081'}

The desired output is

result= 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
{'ecNumber': '2.4.1.11'}

I would like to store each tag and its corresponding value. For example,

{'ecNumber': '2.4.1.11','kmValue':'0.57','kmValueMaximum':'1.25'}

Answer 1

You can try this:

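import re
s = 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
# Match tag names (letters at the start or after '#', followed by '*')
# and values (digits, dots, or hyphens between '*' and '#').
new_data = re.findall(r'(?<=^)[a-zA-Z]+(?=\*)|(?<=#)[a-zA-Z]+(?=\*)|(?<=\*)[-\d\.]+(?=#)', s)
# Pair up consecutive matches as [tag, value].
final_data = dict([new_data[i:i+2] for i in range(0, len(new_data)-1, 2)])
print(final_data)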

Output:

{'kmValue': '0.57', 'kmValueMaximum': '1.25', 'ecNumber': '2.4.1.11'}
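
For reference, the attempt in the question fails for two reasons: `.*` is greedy, so it runs to the last `!` rather than the first, and the `*` in `'Number*'` is read as a quantifier rather than a literal asterisk. Below is a minimal sketch of the same idea with those two points addressed. It assumes Python 3 and the tag*value# layout shown in the question; the names `first_record`, `ec_number`, and `pairs` are illustrative and not from the original post.

```python
import re

s = ('ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!'
     'ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!')

# Non-greedy (.*?) stops at the first '!' instead of the last one.
first_record = re.search(r'(.*?)!', s).group(1)

# re.escape turns '*' into a literal character; the value is
# everything up to the next '#'.
ec_number = re.search(re.escape('ecNumber*') + r'([^#]*)#', first_record).group(1)

# Capture every tag/value pair in one pass: letters, a literal '*',
# then the value up to the next '#'.
pairs = dict(re.findall(r'([A-Za-z]+)\*([^#]*)#', first_record))

print(first_record)
print(ec_number)
print(pairs)
```

With two capture groups, `re.findall` returns a list of (tag, value) tuples, which `dict` accepts directly, so no manual pairing by index is needed.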

Answer 2

Although you asked for a solution with regular expressions, I would say it is much easier to use direct string operations for this problem, since the source string is well formatted.

For the information before the first !:

print(dict([i.split('*') for i in s.split('!', 1)[0].split('#') if i]))

For all information in s:

print([dict([i.split('*') for i in j.split('#') if i]) for j in s.split('!') if j])
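
The two one-liners above can also be wrapped in a small helper. This is only a sketch, assuming Python 3, records terminated by `!`, fields terminated by `#`, and each field in the form tag*value; the function name `parse_records` is hypothetical. It uses `str.partition` so that a field with an empty value (such as `kmValueMaximum*` in the second record) maps to an empty string.

```python
def parse_records(text):
    """Split text into records on '!' and each record into a {tag: value} dict."""
    records = []
    for record in text.split('!'):
        if not record:
            continue  # skip the empty piece after the trailing '!'
        fields = {}
        for field in record.split('#'):
            if not field:
                continue  # skip the empty piece after the trailing '#'
            # partition splits on the first '*' only
            tag, _, value = field.partition('*')
            fields[tag] = value
        records.append(fields)
    return records


s = ('ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!'
     'ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!')

all_records = parse_records(s)
print(all_records[0])  # information before the first '!'
print(all_records)     # all information in s
```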

2 answers

Answer 1
1 2017-12-08 02:49:09

Answer 2
1 accepted 2017-12-08 03:56:08
