
Convert string tuples to dict

I have a malformed string:

a = '(a,1.0),(b,6.0),(c,10.0)'

I need a dict:

d = {'a':1.0, 'b':6.0, 'c':10.0}

I tried:

import ast
print(ast.literal_eval(a))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000F67E828>

Then I tried replacing characters to build a 'string dict', but it is ugly and does not work:

b = (a.replace(',(', '|{').replace(',', ' : ')
      .replace('|', ', ').replace('(', '{').replace(')', '}'))
print(b)
# {a : 1.0}, {b : 6.0}, {c : 10.0}

print (ast.literal_eval(b))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000C2EA588>

What would you do? Am I missing something? Is it possible to use a regex?

No need for regexes, if your string is in this format.

>>> a = '(a,1.0),(b,6.0),(c,10.0)'
>>> d = dict([x.split(',') for x in a[1:-1].split('),(')])
>>> print(d)
{'c': '10.0', 'a': '1.0', 'b': '6.0'}

We remove the first opening parenthesis and the last closing parenthesis, then split on ),( to get the key-value pairs. Each pair can then be split on the comma.

To cast to float, the list comprehension gets a little longer:

# k and v are used as loop names so that the input string a is not shadowed
d = dict([(k, float(v)) for (k, v) in [x.split(',') for x in a[1:-1].split('),(')]])
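
The split-based approach above can also be wrapped in a small helper. This is only a sketch: the name parse_pairs is hypothetical, and it assumes the input is non-empty and always has the exact form (key,value),(key,value),... with no spaces and no nested parentheses.

```python
def parse_pairs(text):
    """Turn '(a,1.0),(b,6.0)' into {'a': 1.0, 'b': 6.0}."""
    # Drop the outer '(' and ')' and split into 'key,value' chunks.
    chunks = text[1:-1].split('),(')
    result = {}
    for chunk in chunks:
        # Split only on the first comma: key on the left, number on the right.
        key, value = chunk.split(',', 1)
        result[key] = float(value)
    return result


d = parse_pairs('(a,1.0),(b,6.0),(c,10.0)')
print(d == {'a': 1.0, 'b': 6.0, 'c': 10.0})  # True
```

A plain loop is used here instead of the nested comprehension so that each step (strip, split into pairs, split each pair, cast) can be read on its own line.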

If there are always 2 comma-separated values inside parentheses and the second is of a float type, you may use

import re
s = '(a,1.0),(b,6.0),(c,10.0)'
# A generator expression replaces lambda (w, m): ..., which is a syntax error in Python 3
print(dict((w, float(m)) for w, m in re.findall(r'\(([^),]+),([^)]*)\)', s)))

See the Python demo and the (quite generic) regex demo. This pattern matches a (, then one or more chars other than a comma and ) (captured into Group 1), then a comma, then zero or more chars other than ) (captured into Group 2), and finally a ).

As the pattern above is suitable when you have pre-validated data, the regex can be restricted for your current data as

r'\((\w+),(\d*\.?\d+)\)'

See the regex demo

Details:

  • \( - a literal (
  • (\w+) - Capturing group 1: one or more word (letter/digit/ _ ) chars
  • , - a comma
  • (\d*\.?\d+) - Capturing group 2, a common integer/float regex: zero or more digits, an optional . (decimal separator) and 1+ digits
  • \) - a literal closing parenthesis.
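
As a sketch of how the restricted pattern could be used, the snippet below feeds it to re.findall and builds the dict with a comprehension. The name PAIR_RE is an assumption made for this illustration, not part of the original answer.

```python
import re

# Restricted pattern from above: a word-character key and an unsigned number.
PAIR_RE = re.compile(r'\((\w+),(\d*\.?\d+)\)')

s = '(a,1.0),(b,6.0),(c,10.0)'
d = {key: float(value) for key, value in PAIR_RE.findall(s)}
print(d == {'a': 1.0, 'b': 6.0, 'c': 10.0})  # True

# Pairs that do not fit the pattern are skipped rather than raising.
print(PAIR_RE.findall('(a,1.0),(b,oops)'))  # [('a', '1.0')]
```

Note that malformed pairs are silently dropped, so this fits best when losing bad entries is acceptable or the data has already been checked.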

Given that the string has the format stated above, you could use regex substitution with backreferences:

import ast
import re

a = '(a,1.0),(b,6.0),(c,10.0)'
a_fix = re.sub(r'\((\w+),', r"('\1',",a)

So you look for the pattern (x, (where x is a sequence of \w characters) and replace it with ('x', . The result is then:

# result
a_fix == "('a',1.0),('b',6.0),('c',10.0)"

You can then parse a_fix and convert it to a dict:

result = dict(ast.literal_eval(a_fix))

The result is then:

>>> dict(ast.literal_eval(a_fix))
{'b': 6.0, 'c': 10.0, 'a': 1.0}

The reason eval() does not work is that a, b and c are not defined. We can define each name as its own string form, and eval will then pick up those strings:

In [11]: text = '(a,1.0),(b,6.0),(c,10.0)'

In [12]: a, b, c = 'a', 'b', 'c'

In [13]: eval(text)
Out[13]: (('a', 1.0), ('b', 6.0), ('c', 10.0))

In [14]: dict(eval(text))
Out[14]: {'a': 1.0, 'b': 6.0, 'c': 10.0}

To do this the regex way:

In [21]: re.sub(r'\((.+?),', r'("\1",', text)
Out[21]: '("a",1.0),("b",6.0),("c",10.0)'
In [22]: eval(_)
Out[22]: (('a', 1.0), ('b', 6.0), ('c', 10.0))

In [23]: dict(_)
Out[23]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
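
As a variation on the first eval approach above, where a, b and c are defined by hand, the names can be collected from the string and handed to eval as a namespace. This is a rough sketch under assumptions: the variable names (names, pairs) are made up for the example, the keys are plain identifiers, and the string holds at least two pairs (a single pair would evaluate to one flat tuple, which dict() cannot consume).

```python
import re

text = '(a,1.0),(b,6.0),(c,10.0)'

# Map every identifier that follows '(' to its own string form.
names = {name: name for name in re.findall(r'\((\w+),', text)}

# An empty __builtins__ keeps the expression away from built-in functions;
# eval is still not safe for untrusted input.
pairs = eval(text, {'__builtins__': {}}, names)
d = dict(pairs)
print(d == {'a': 1.0, 'b': 6.0, 'c': 10.0})  # True
```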

