Experts,
I have written a program to convert a string into a dictionary. It produces the desired result, but I doubt that this is the Pythonic way to do it. I would like to hear suggestions.
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
I split each line on the colon (:) and store the parts in a dictionary. Here cities and HeadQuarters each contain another dictionary, for which I have written code like this:
if k == 'cities':
    D[k] = {}
    continue
elif k == 'HeadQuarters':
    D[k] = {}
    continue
elif k == 'LA':
    if D.has_key('cities'):
        if D['cities'].get(k) is None:
            D['cities'][k] = v
    if D.has_key('HeadQuarters'):
        if D['HeadQuarters'].get(k) is None:
            D['HeadQuarters'][k] = v
elif k == 'NY':
    if D.has_key('cities'):
        if D['cities'].get(k) is None:
            D['cities'][k] = v
    if D.has_key('HeadQuarters'):
        if D['HeadQuarters'].get(k) is None:
            D['HeadQuarters'][k] = v
else:
    D[k] = v
I am not sure whether this is Pythonic:
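import re

# Split on colons and newlines, then pair the pieces up as (key, value)
x = re.split(r':|\n', txt)[1:-1]
x = list(map(lambda x: x.rstrip(), x))
x = zip(x[::2], x[1::2])
d = {}
for i in range(len(x)):
    if not x[i][0].startswith(' '):
        if x[i][1] != '':
            d[x[i][0]] = x[i][1]
        else:
            # An empty value starts a sub-dictionary made of the indented rows
            t = x[i][0]
            tmp = {}
            i += 1
            while i < len(x) and x[i][0].startswith(' '):
                tmp[x[i][0].strip()] = x[i][1]
                i += 1
            d[t] = tmp
print d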
Output:
{'Country': ' USA', 'cities': {'NY': ' New York', 'LA': ' Los Angeles'}, 'name': ' xxxx', 'desgination': ' yyyy', 'HeadQuarters': {'NY': ' NY', 'LA': ' LA'}}
You can use the split method here, a little recursion for your sub-dictionaries, and an assumption that the lines of your sub-dictionaries start with a tab (\t) or four spaces:
def txt_to_dict(txt):
    data = {}
    lines = txt.split('\n')
    i = 0
    while i < len(lines):
        try:
            key, val = lines[i].split(':')
        except ValueError:
            # Skip blank lines and rows that are not in "key : value" form
            i += 1
            continue
        key = key.strip()
        val = val.strip()
        if len(val) == 0:
            # An empty value marks the start of an indented sub-dictionary
            i += 1
            sub = ""
            while i < len(lines) and (lines[i].startswith('\t') or lines[i].startswith('    ')):
                sub += lines[i] + '\n'
                i += 1
            data[key] = txt_to_dict(sub[:-1])  # remove last newline character
        else:
            data[key] = val
            i += 1
    return data
And then you would just call it on your variable txt as:
>>> print txt_to_dict(txt)
{'Country': 'USA', 'cities': {'NY': 'New York', 'LA': 'Los Angeles'}, 'name': 'xxxx', 'desgination': 'yyyy', 'HeadQuarters': {'NY': 'NY', 'LA': 'LA'}}
The sample output is shown above; the sub-dictionaries are created properly.
I also added some error handling.
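If the nesting could go more than one level deep, the same idea can be written without recursion by keeping a stack of open dictionaries. The following is only a sketch in Python 3, not part of the answer above: the name parse_indented and the sample text are hypothetical, and it assumes that nesting is expressed purely by consistent leading whitespace.

```python
def parse_indented(text):
    """Parse 'key : value' lines into nested dicts, using indentation for nesting."""
    root = {}
    # Stack of (indent_width, dict) pairs; the last entry is the innermost open dict.
    stack = [(-1, root)]
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition(':')
        if not sep:
            continue  # skip lines that contain no colon
        indent = len(line) - len(line.lstrip())
        key, value = key.strip(), value.strip()
        # Close every dict opened at the same or a deeper indentation level.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value
        else:
            # A key without a value opens a new sub-dictionary.
            parent[key] = {}
            stack.append((indent, parent[key]))
    return root


# Hypothetical sample in the same layout as the question's txt
sample = """
name : xxxx
cities :
    LA : Los Angeles
    NY : New York
Country : USA
"""
result = parse_indented(sample)
```

Because partition splits at the first colon only, a value that itself contains a colon is kept whole instead of raising an error.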
This produces the same output as your code. It was arrived at primarily by refactoring what you had and applying a few common Python idioms.
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
D = {}  # added to test code
for line in (line for line in txt.splitlines() if line):  # added to test code
    k, _, v = [s.strip() for s in line.partition(':')]  # added to test code
    if k in {'cities', 'HeadQuarters'}:
        D[k] = {}
        continue
    elif k in {'LA', 'NY'}:
        for k2 in (x for x in ('cities', 'HeadQuarters') if x in D):
            if k not in D[k2]:
                D[k2][k] = v
    else:
        D[k] = v

import pprint
pprint.pprint(D)
Output:
{'Country': 'USA',
 'HeadQuarters': {'LA': 'LA', 'NY': 'NY'},
 'cities': {'LA': 'Los Angeles', 'NY': 'New York'},
 'desgination': 'yyyy',
 'name': 'xxxx'}
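The loop above still hard-codes the key names cities, HeadQuarters, LA and NY. One way to drop them, sketched here in Python 3 under the assumption that the sub-entries are indented in the input, is to remember which sub-dictionary is currently open. The function name parse_sections and the sample text are hypothetical and only handle one level of nesting.

```python
def parse_sections(text):
    """Build a dict from 'key : value' lines; indented lines go into the open section."""
    result = {}
    section = None  # sub-dictionary currently being filled, if any
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = (part.strip() for part in line.partition(':'))
        if not value:
            # A key with no value opens a new sub-dictionary.
            section = result[key] = {}
        elif line[0].isspace() and section is not None:
            # An indented line belongs to the most recently opened section.
            section[key] = value
        else:
            # An unindented 'key : value' line closes any open section.
            section = None
            result[key] = value
    return result


# Hypothetical sample in the same layout as the question's txt
sample = """
name : xxxx
cities :
    LA : Los Angeles
HeadQuarters :
    LA : LA
Country : USA
"""
parsed = parse_sections(sample)
```

Unlike the version with hard-coded names, each LA entry ends up only in the section it appears under, so no "already present" check is needed.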
You could use an existing YAML parser (the PyYAML package):
import yaml # $ pip install pyyaml
data = yaml.safe_load(txt)
{'Country': 'USA',
 'HeadQuarters': {'LA': 'LA', 'NY': 'NY'},
 'cities': {'LA': 'Los Angeles', 'NY': 'New York'},
 'desgination': 'yyyy',
 'name': 'xxxx'}
The parser accepts your input as is, but making it more conformant YAML requires small modifications:
---
Country: USA
HeadQuarters:
  LA: LA
  NY: NY
cities:
  LA: "Los Angeles"
  NY: "New York"
desgination: yyyy
name: xxxx
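This works, although it produces a flat dictionary: the nested keys are not grouped under cities and HeadQuarters.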
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
di = {}
for line in txt.split('\n'):
    if len(line) > 1:
        di[line.split(':')[0].strip()] = line.split(':')[1].strip()
print di  # {'name': 'xxxx', 'desgination': 'yyyy', 'LA': 'LA', 'Country': 'USA', 'HeadQuarters': '', 'NY': 'NY', 'cities': ''}
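One caveat with indexing the result of split(':') is that a value containing a colon gets cut off at its first colon. A small Python 3 sketch of the difference, using a hypothetical helper split_pair and a made-up input line:

```python
def split_pair(line):
    """Split 'key : value' at the first colon only (hypothetical helper)."""
    key, _, value = line.partition(':')
    return key.strip(), value.strip()


line = 'site : http://example.com'  # hypothetical input, not from the question
print(line.split(':')[1].strip())   # prints: http
print(split_pair(line))             # prints: ('site', 'http://example.com')
```

Calling line.split(':', 1) would behave the same way as partition here; partition has the small advantage of always returning three parts, even when the line contains no colon.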