Pythonic way of parsing a string into a dictionary with a dict comprehension

For a given string

key = "test: abc :bcd,ef:1923:g, x : y : z\nkey2 :1st:second\n  etc :values:2,3,4:..."

I would like to parse the string into a dict, line by line, with the first token of each line as the key and the remaining tokens as the value list. The expected result is:

{'test': ['abc', 'bcd,ef', '1923', 'g, x', 'y', 'z'], 'key2': ['1st', 'second'], 'etc': ['values', '2,3,4', '...']}

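So far I have:

def parseLine(line):
    # Split on ':' and strip surrounding whitespace from each token.
    return list(map(str.strip, line.split(":")))

# Note: parseLine(line) is evaluated twice for every line.
result = {parseLine(line)[0]: parseLine(line)[1:] for line in key.split('\n')}
print(result)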

But in the dict comprehension, parseLine is called twice per line: once for the key (parseLine(line)[0]) and once for the value (parseLine(line)[1:]).

Is there a better way to write the dict comprehension?

{lst[0]:lst[1:] for lst in map(lambda s: list(map(str.strip, s.split(":"))), key.split('\n'))}

It gives:

{'test': ['abc', 'bcd,ef', '1923', 'g, x', 'y', 'z'],
 'key2': ['1st', 'second'],
 'etc': ['values', '2,3,4', '...']}

You can use map to apply the function once per line, and then unpack each result in the comprehension's for clause: k takes the first token and *v collects the rest into a list.

result = {k: v for k, *v in map(parseLine, key.split('\n'))}
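As a self-contained sketch of the unpacking approach above: the variable name text and the `if k` filter are assumptions added for illustration (the filter skips blank lines, which would otherwise yield an empty-string key); they are not part of the original answer.

```python
def parseLine(line):
    # Split on ':' and strip surrounding whitespace from each token.
    return [token.strip() for token in line.split(":")]

# `text` is a hypothetical name for the input string from the question.
text = "test: abc :bcd,ef:1923:g, x : y : z\nkey2 :1st:second\n  etc :values:2,3,4:..."

# parseLine runs once per line; k gets the first token, *v the rest as a list.
# Assumption: lines whose first token is empty (e.g. blank lines) are skipped.
result = {k: v for k, *v in map(parseLine, text.split("\n")) if k}
print(result)
```

Because the unpacking happens in the for clause, no intermediate variable or second call is needed.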

Note also that if you're using parseLine only for this, you can drop the conversion to list, because starred unpacking accepts any iterable:

def parseLine(line):
    return map(str.strip, line.split(":"))
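A small sketch of why this works, and of the trade-off: starred unpacking consumes any iterable, but a map object cannot be indexed, so the original parseLine(line)[0] style would no longer work with this version. The sample line is taken from the question's string.

```python
def parseLine(line):
    # Returns a lazy map object instead of a list.
    return map(str.strip, line.split(":"))

# Starred unpacking iterates over the map object, so this still works.
first, *rest = parseLine("key2 :1st:second")
print(first, rest)  # key2 ['1st', 'second']

# Indexing does not: map objects do not support subscripting.
try:
    parseLine("key2 :1st:second")[0]
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)
```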
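Alternatively, use a regular expression to replace every separator character with a space, then split each line on whitespace:

import re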

s = "test: abc :bcd,ef:1923:g, x : y : z\nkey2 :1st:second\n  etc :values:2,3,4:..."

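# Keep '\n' in the character class; otherwise the newlines are replaced too
# and s.split("\n") below sees only a single line.
s = re.sub(r'[^a-z0-9,\n]', ' ', s)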

print({x.split()[0]: x.split()[1:] for x in s.split("\n")})

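Output (note that it differs from the requested result: 'g, x' is split into 'g,' and 'x', and the '...' token is dropped):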

{'test': ['abc', 'bcd,ef', '1923', 'g,', 'x', 'y', 'z'], 'key2': ['1st', 'second'], 'etc': ['values', '2,3,4']}
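If a regular expression is preferred, one possible variant (a sketch, not the answer's method) splits on the colon together with its surrounding whitespace instead of blanking out characters. This keeps tokens such as 'g, x' and '...' intact. The name SEP is hypothetical.

```python
import re

s = "test: abc :bcd,ef:1923:g, x : y : z\nkey2 :1st:second\n  etc :values:2,3,4:..."

# Hypothetical helper pattern: a colon plus any whitespace on either side.
SEP = re.compile(r"\s*:\s*")

# Strip each line first so leading/trailing spaces do not end up in tokens,
# then split once per line and unpack into key and value list.
result = {k: v for k, *v in (SEP.split(line.strip()) for line in s.split("\n"))}
print(result)
```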
