I have a huge text file that I need to parse.
Each line of the file contains some text followed by a dict. I only care about the dict data.
The file contains logs in the following format:
my data : {"a":1, "b":2, "c": 3}
my data : {"a":23, "b": 44, "c": 565}
my_data : {"a":1233, "b": 21, "c":544}
From the data above, I only want the dicts.
I tried:
f = open('text.file', 'r')
my_dict = eval(f.read())
but it raises an error because the initial part of each line is a plain string. So, my question is: what is the best way to extract the dicts from the file?
It looks like you've got a delimiter between the label and the dict, so str.split() is your friend there.
Afterwards, consider using ast.literal_eval from the ast module instead of eval. It presents less of a security risk than blindly eval'ing, because it only evaluates Python literals.
>>> import ast
>>> a = ast.literal_eval("{'a':1}")
>>> type(a)
<class 'dict'>
>>> a
{'a': 1}
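Putting the two suggestions together, a minimal sketch might look like the following. It assumes every line has the form `label : {...}`, with the first colon separating the label from the dict. The helper name `parse_line` and the in-memory sample lines are illustrative and not part of the original question.

```python
import ast

def parse_line(line):
    """Return the dict literal that follows the first ':' on a line."""
    # Split only once so that colons inside the dict are left untouched.
    _, dict_text = line.split(":", 1)
    # literal_eval accepts only Python literals, so no arbitrary code runs.
    return ast.literal_eval(dict_text.strip())

# Sample lines standing in for the contents of the log file.
sample_lines = [
    'my data : {"a":1, "b":2, "c": 3}',
    'my data : {"a":23, "b": 44, "c": 565}',
]
parsed = [parse_line(line) for line in sample_lines]
print(parsed)
```

For a real file, the same function could be applied to each line while iterating over the open file object, which avoids reading the whole file at once.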
eval is bad.
Here's what I would do:
import json
dicts = []
with open('text.file', 'r') as f:
    for line in f:
        # Skip blank lines and lines without a ':' delimiter
        if ':' not in line:
            continue
        _, dict_str = line.split(':', 1)
        # json.loads parses a string; json.load expects a file object
        parsed = json.loads(dict_str.strip())
        dicts.append(parsed)
You can use the re module:
import ast
import re
text = """my data : {"a":1, "b":2, "c": 3}
my data : {"a":23, "b": 44, "c": 565}
my_data : {"a":1233, "b": 21, "c":544}"""
# Match a flat {...} span (nested braces are not handled)
pattern = re.compile(r"{[^}]*?}")
for match in pattern.finditer(text):
    # literal_eval parses the literal without executing arbitrary code
    my_dict = ast.literal_eval(match.group())
    print(my_dict)
which gives you
{'b': 2, 'c': 3, 'a': 1}
{'b': 44, 'c': 565, 'a': 23}
{'b': 21, 'c': 544, 'a': 1233}
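Since the file in the question is described as huge, the regex idea can also be applied one line at a time instead of to a single large string. This is only a sketch under stated assumptions: the dicts are flat (no nested braces) and valid JSON, lines without a dict should be skipped, and `iter_dicts` is a hypothetical helper name. `io.StringIO` stands in for the real file here.

```python
import io
import json
import re

# Matches a flat {...} span; nested braces are not handled by this pattern.
DICT_PATTERN = re.compile(r"\{[^}]*\}")

def iter_dicts(lines):
    """Yield one dict for each line that contains a {...} span."""
    for line in lines:
        match = DICT_PATTERN.search(line)
        if match is None:
            continue  # no dict on this line, skip it
        yield json.loads(match.group())

# io.StringIO stands in for an open file object in this example.
fake_file = io.StringIO(
    'my data : {"a":1, "b":2, "c": 3}\n'
    'a line without a dict\n'
    'my_data : {"a":1233, "b": 21, "c":544}\n'
)
result = list(iter_dicts(fake_file))
print(result)
```

Because `iter_dicts` is a generator, passing it an open file object would process the file lazily, holding only one line in memory at a time.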