
Convert string tuples to dict

I have a malformed string:

a = '(a,1.0),(b,6.0),(c,10.0)'

I need a dict:

d = {'a':1.0, 'b':6.0, 'c':10.0}

I tried:

print (ast.literal_eval(a))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000F67E828>

Then I tried replacing characters to build a 'string dict', but it is ugly and does not work:

b = (a.replace(',(','|{').replace(',',' : ')
      .replace('|',', ').replace('(','{').replace(')','}'))
print (b)
{a : 1.0}, {b : 6.0}, {c : 10.0}

print (ast.literal_eval(b))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000C2EA588>

What would you do? Is something missing? Is it possible to use a regex?

No need for regexes if your string is in this format.

>>> a = '(a,1.0),(b,6.0),(c,10.0)'
>>> d = dict([x.split(',') for x in a[1:-1].split('),(')])
>>> print(d)
{'c': '10.0', 'a': '1.0', 'b': '6.0'}

We remove the first opening parenthesis and the last closing parenthesis, then split on ),( to get the key-value pairs. Each pair can then be split on the comma.

To cast to float, the list comprehension gets a little longer:

d = dict([(k, float(v)) for (k, v) in [x.split(',') for x in a[1:-1].split('),(')]])
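
The split-based approach above can also be wrapped in a small helper. This is a minimal sketch: the name parse_pairs is hypothetical, and it assumes the input always has the exact form (key,value),(key,value) with no spaces and no nested parentheses.

```python
def parse_pairs(text):
    """Turn '(a,1.0),(b,6.0)' into {'a': 1.0, 'b': 6.0}."""
    # Drop the outer "(" and ")" and split on the "),(" between pairs.
    pairs = text[1:-1].split('),(')
    result = {}
    for pair in pairs:
        # Each pair looks like "a,1.0"; split once on the first comma.
        key, value = pair.split(',', 1)
        result[key] = float(value)
    return result


a = '(a,1.0),(b,6.0),(c,10.0)'
print(parse_pairs(a))
```

Writing the loop out makes it easier to add error handling later, for example catching the ValueError that float() raises on a non-numeric value.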

If there are always two comma-separated values inside the parentheses and the second is a float, you may use

import re
s = '(a,1.0),(b,6.0),(c,10.0)'
# Tuple-unpacking lambdas such as "lambda (w, m): ..." are Python 2 only,
# so a dict comprehension is used here instead.
print({w: float(m) for w, m in re.findall(r'\(([^),]+),([^)]*)\)', s)})

See the Python demo and the (quite generic) regex demo. This pattern matches a (, then one or more chars other than a comma or ) (captured into Group 1), then a comma, then zero or more chars other than ) (captured into Group 2), and finally a ).

The pattern above is suitable when you have pre-validated data. For your current data, the regex can be restricted to

r'\((\w+),(\d*\.?\d+)\)'

See the regex demo.

Details:

  • \( - a literal (
  • (\w+) - Capturing group 1: one or more word (letter/digit/_) chars
  • , - a comma
  • (\d*\.?\d+) - a common integer/float regex: zero or more digits, an optional . (decimal separator) and 1+ digits
  • \) - a literal closing parenthesis.
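
As a sketch of how the restricted pattern might be applied, the snippet below collects the pairs with re.findall and first checks that the whole string consists of such pairs. The helper name strict_parse and the whole-string check are assumptions added for illustration; they are not part of the answer above.

```python
import re

# The restricted pattern from above: (word,number)
PAIR = r'\((\w+),(\d*\.?\d+)\)'


def strict_parse(text):
    # Require the whole string to be pairs separated by single commas.
    if not re.fullmatch(r'{0}(?:,{0})*'.format(PAIR), text):
        raise ValueError('unexpected format: %r' % text)
    # findall returns one (key, number) tuple per matched pair.
    return {key: float(num) for key, num in re.findall(PAIR, text)}


print(strict_parse('(a,1.0),(b,6.0),(c,10.0)'))
```

Without the re.fullmatch check, re.findall would silently skip any part of the string that does not match, which may or may not be what you want.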

Given that the string has the format stated above, you could use regex substitution with backreferences:

import ast
import re

a = '(a,1.0),(b,6.0),(c,10.0)'
a_fix = re.sub(r'\((\w+),', r"('\1',", a)

So you look for a pattern (x, (with x a sequence of \w characters) and substitute it with ('x', . The result is then:

# result
a_fix == "('a',1.0),('b',6.0),('c',10.0)"

and then parse a_fix and convert it to a dict:

result = dict(ast.literal_eval(a_fix))

The result is then:

>>> dict(ast.literal_eval(a_fix))
{'b': 6.0, 'c': 10.0, 'a': 1.0}
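
The two steps above can be combined into one function. This is a minimal sketch, assuming the keys consist only of word characters; the function name tuples_to_dict is hypothetical, and the trailing comma is an addition so that a string with a single pair still parses as a tuple of pairs.

```python
import ast
import re


def tuples_to_dict(text):
    # Quote each bare key: "(a," becomes "('a',".
    quoted = re.sub(r'\((\w+),', r"('\1',", text)
    # The trailing comma makes the result a tuple of pairs even when
    # there is only one pair in the input.
    parsed = ast.literal_eval(quoted + ',')
    return dict(parsed)


print(tuples_to_dict('(a,1.0),(b,6.0),(c,10.0)'))
```

Because ast.literal_eval only accepts literals, this version never executes arbitrary code from the input string.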

The reason eval() does not work is that a, b and c are not defined. We can define each of them as its own string form, and eval will then use those strings:

In [11]: text = '(a,1.0),(b,6.0),(c,10.0)'

In [12]: a, b, c = 'a', 'b', 'c'

In [13]: eval(text)
Out[13]: (('a', 1.0), ('b', 6.0), ('c', 10.0))

In [14]: dict(eval(text))
Out[14]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
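
Rather than assigning a, b and c by hand, the names could be collected from the string and passed to eval as a namespace. This is only a sketch: the helper name eval_with_names is hypothetical, the identifier pattern is an assumption about what the keys look like, and eval should only be used on input you trust.

```python
import re


def eval_with_names(text):
    # Map every identifier in the text to its own name, e.g. {'a': 'a'}.
    names = {name: name for name in re.findall(r'[A-Za-z_]\w*', text)}
    # An empty __builtins__ keeps built-in functions out of the expression.
    namespace = {'__builtins__': {}}
    namespace.update(names)
    # The trailing comma keeps a single pair wrapped in an outer tuple.
    return dict(eval(text + ',', namespace))


print(eval_with_names('(a,1.0),(b,6.0),(c,10.0)'))
```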

To do this the regex way:

In [21]: re.sub(r'\((.+?),', r'("\1",', text)
Out[21]: '("a",1.0),("b",6.0),("c",10.0)'
In [22]: eval(_)
Out[22]: (('a', 1.0), ('b', 6.0), ('c', 10.0))

In [23]: dict(_)
Out[23]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
