How to preserve types when using pyparsing and returning the result as a dictionary
I'm new to pyparsing and hope somebody can help me. The text I am trying to parse has the following structure:
Each line contains one or more key=value pairs. The values can be of many types, such as string, int, float, list, or dictionary. The key is always a string. Example of a line with 4 pairs:
mode='clip' clipzeros=True field='1331+0444=3C286' clipminmax=[0,1.2]
So I have defined my grammar and parser to be:
import pyparsing as pp
key = pp.Word(pp.alphas+"_")
separator = pp.Literal("=").suppress()
value = pp.Word(pp.printables)
pair = key+separator+value
line = pp.OneOrMore(pair)
mytest = "mode='clip' clipzeros=True field='1331+0444=3C286' clipminmax=[0,1.2]"
res = line.parseString(mytest)
print(res)
It returns this:
['mode', "'clip'", 'clipzeros', 'True', 'field', "'1331+0444=3C286'", 'clipminmax', '[0,1.2]']
Two things I want to have as results:
I'd like to have the results as a dictionary such as: {"mode":"clip", "clipzeros":True, "field":"1331+0444=3C286", "clipminmax":[0,1.2]}
I'd like to keep the types of the values in the resulting dictionary. For example, the value of clipzeros is a boolean and the value of clipminmax is a list.
Is this at all possible in pyparsing?
Thanks a lot for any help.
Sandra
Try using eval() to get your type.
import pyparsing as pp
key = pp.Word(pp.alphas+"_")
separator = pp.Literal("=").suppress()
value = pp.Word(pp.printables)
pair = key+separator+value
line = pp.OneOrMore(pair)
mytest = "mode='clip' clipzeros=True field='1331+0444=3C286' clipminmax=[0,1.2]"
res = line.parseString(mytest)
mydict = dict(zip(res[::2],[eval(x) for x in res[1::2]]))
print(mydict)
Yields:
{'field': '1331+0444=3C286', 'mode': 'clip', 'clipzeros': True, 'clipminmax': [0, 1.2]}
Another example:
res = ['mode', "'clip'", 'clipzeros', 'True', 'field', "'R RQT'", 'clipminmax', '[0,1.2]']
mydict = dict(zip(res[::2],[eval(x) for x in res[1::2]]))
print(mydict)
Yields:
{'field': 'R RQT', 'mode': 'clip', 'clipzeros': True, 'clipminmax': [0, 1.2]}
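Since eval() will execute any expression it is handed, the same pairing step can be sketched with ast.literal_eval from the standard library, which only accepts Python literals. This is a minimal sketch rather than part of the answer above; the helper name tokens_to_dict is made up for illustration, and it assumes the flat [key, value, key, value, ...] token list shown earlier.

```python
import ast

def tokens_to_dict(tokens):
    """Pair up a flat [key, value, key, value, ...] list of strings.

    tokens_to_dict is a hypothetical helper name. Each value string is
    converted with ast.literal_eval, which accepts only Python literals
    (strings, numbers, booleans, None, lists, dicts, tuples, sets).
    """
    keys = tokens[::2]
    values = [ast.literal_eval(text) for text in tokens[1::2]]
    return dict(zip(keys, values))

res = ['mode', "'clip'", 'clipzeros', 'True',
       'field', "'R RQT'", 'clipminmax', '[0,1.2]']
mydict = tokens_to_dict(res)
print(mydict)

# Anything that is not a plain literal is rejected instead of executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print("non-literal rejected:", rejected)
```

The trade-off is that values must be written as valid Python literals; an unquoted word other than True, False or None would raise an error rather than being treated as a string.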
An alternative to pyparsing (I don't have that module):
class Parser(object):
    def __init__(self, primarydivider, secondarydivider):
        self.prime = primarydivider
        self.second = secondarydivider
    def parse(self, string):
        res = self.initialsplit(string)
        new = []
        for entry in res:
            if self.second not in entry:
                # no key/value divider, so this chunk continues the previous value
                new[-1] += ' ' + entry
            else:
                new.append(entry)
        # split on the first divider only, so a value may itself contain '='
        pairs = [entry.split(self.second, 1) for entry in new]
        return dict((key, eval(value)) for key, value in pairs)
    def initialsplit(self, string):
        return string.split(self.prime)
mytest = "mode='clip' clipzeros=True field='AEF D' clipminmax=[0,1.2]"
myParser = Parser(' ', '=')
parsed = myParser.parse(mytest)
print(parsed)
Yields:
{'field': 'AEF D', 'mode': 'clip', 'clipzeros': True, 'clipminmax': [0, 1.2]}
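For comparison, the same split-then-convert idea can be sketched with a single regular expression and ast.literal_eval. This is only an illustration under stated assumptions: the names PAIR and parse_pairs are invented here, the input is a single line, and a value is taken to end where the next " key=" begins, so a quoted value containing whitespace followed by a word and '=' would be split incorrectly.

```python
import ast
import re

# A key, '=', then the shortest run of text that is followed either by
# whitespace plus the next "key=" or by the end of the line.
# PAIR and parse_pairs are hypothetical names used for illustration.
PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)")

def parse_pairs(text):
    """Return a dict of key -> typed value for one line of key=value pairs."""
    return {m.group(1): ast.literal_eval(m.group(2))
            for m in PAIR.finditer(text)}

mytest = "mode='clip' clipzeros=True field='AEF D' clipminmax=[0,1.2]"
parsed = parse_pairs(mytest)
print(parsed)

# A value containing '=' (but no whitespace before it) also survives.
parsed_eq = parse_pairs("field='1331+0444=3C286' clipzeros=False")
print(parsed_eq)
```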
Edit in response to the OP's comments:
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> mytest = "mode='clip' clipzeros=True field='R RQT' clipminmax=[0,1.2]"
>>> print mytest
mode='clip' clipzeros=True field='R RQT' clipminmax=[0,1.2]
>>>
Rather than use something as generic as Word(printables), I'd suggest defining specific expressions for different types of value literals:
from pyparsing import *
# what are the different kinds of values you can get?
int_literal = Combine(Optional('-') + Word(nums))
float_literal = Regex(r'-?\d+\.\d*')
string_literal = quotedString
bool_literal = Keyword("True") | Keyword("False")
none_literal = Keyword("None")
list_literal = originalTextFor(nestedExpr('[',']'))
dict_literal = originalTextFor(nestedExpr('{','}'))
# define an overall value expression, of the different types of values
value = (float_literal | int_literal | bool_literal | none_literal |
         string_literal | list_literal | dict_literal)
key = Word(alphas + '_')
def evalValue(tokens):
    import ast
    # ast.literal_eval is safer than global eval()
    return [ast.literal_eval(tokens[0])]
pair = Group(key("key") + '=' + value.setParseAction(evalValue)("value"))
line = OneOrMore(pair)
Now parse your sample:
sample = """mode='clip' clipzeros=True field='1331+0444=3C286' clipminmax=[0,1.2]"""
result = line.parseString(sample)
for r in result:
    print(r.dump())
    print(r.key)
    print(r.value)
    print("")
Prints:
['mode', '=', 'clip']
- key: mode
- value: clip
mode
clip
['clipzeros', '=', True]
- key: clipzeros
- value: True
clipzeros
True
['field', '=', '1331+0444=3C286']
- key: field
- value: 1331+0444=3C286
field
1331+0444=3C286
['clipminmax', '=', [0, 1.2]]
- key: clipminmax
- value: [0, 1.2]
clipminmax
[0, 1.2]
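The question asks for a dictionary, and the grouped pairs above are one short step away from that. The following is a minimal sketch along the same lines, not the answer's exact code: the names number, constant, bracketed and parse_line are made up for illustration, the values are kept as raw text during parsing, and ast.literal_eval is applied afterwards while building the dict.

```python
import ast
import pyparsing as pp

# Specific value expressions, in the spirit of the grammar above.
# number, constant, bracketed and parse_line are hypothetical names.
key = pp.Word(pp.alphas + "_")
number = pp.Regex(r"-?\d+(?:\.\d*)?")
constant = pp.Keyword("True") | pp.Keyword("False") | pp.Keyword("None")
bracketed = (pp.originalTextFor(pp.nestedExpr("[", "]")) |
             pp.originalTextFor(pp.nestedExpr("{", "}")))
value = number | constant | pp.quotedString | bracketed

# Each pair becomes its own two-token group: [key_text, value_text].
pair = pp.Group(key + pp.Suppress("=") + value)
line = pp.OneOrMore(pair)

def parse_line(text):
    """Parse one line of key=value pairs into a dict with typed values."""
    groups = line.parseString(text, parseAll=True)
    return {k: ast.literal_eval(v) for k, v in groups}

sample = "mode='clip' clipzeros=True field='1331+0444=3C286' clipminmax=[0,1.2]"
as_dict = parse_line(sample)
print(as_dict)
```

Converting after parsing keeps the grammar free of parse actions; attaching the conversion as a parse action, as in the answer, works on the same principle.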