I currently have a list of lists of the following type:
[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]
I am trying to convert this list of lists into a python dictionary that must be written (json.dumps) into a JSON output file as follows:
[{"Person":"John Smith","Message":8},…]
How do I achieve this?
For list of lists of type -
cars_list = [[1,'Honda','red'], [2,'Toyota','white'], [3,'Mazda','blue']]
I understand that using the following code works-
cars_dict = {}
for key, car, color in cars_list:
    cars_dict[key] = [car, color]
-but I'm unable to manipulate the former list of lists into the format I am trying to get because of the existing a:b, c:d format
Edit: This is the code I have written that gave me the resulting list of lists:
f = open("input.txt", "r")
# d = defaultdict(int)
keylist = []
final_use = []
for line in f:
    lineslist = line.split()
    nameslist = lineslist[1:3]
    nameslist = [s.replace(':', '') for s in nameslist]
    keylist.append(nameslist[0]+" "+nameslist[1])
# print(keylist)
d = {}
[d.__setitem__(item,1+d.get(item,0)) for item in keylist]
# print(d)
for person in d:
    final_use.append(["'Person':"+str(person)+","+"'Message':"+str(d[person])])
print(final_use)
sample output of this code is the list of lists that I attached in the beginning
sample output:
[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]
The following is a sample of the data present in input.txt: (not including the entire data since it is a huge file) note: there are empty lines between entries
00:01:44 Yiyang Chen: hello
00:01:46 Junbo Sheng: good morning
00:01:46 Jiayi Lin: 1
00:01:47 Baitong Liu: yes, email me
00:01:47 Zian Fan: afternoon batch
00:01:48 Leon Luc: 1
00:01:48 Zhiqian Wang: 1
00:01:49 Jiahui Lu: 1
00:01:49 Shiming Chen: 1
00:07:47 Yanru Jiang: 1
Description of what this is about: this is a sample of a Zoom chat that I am trying to process. I am taking this input.txt file and trying to output a JSON file that shows each person's name and the number of messages they sent in the chat, in the following format (example):
[{"Person":"John Smith","Message":8},
{"Person":"Yiyang Chen","Message":10},
{"Person":"Junbo Sheng","Message":2}…]
I hope this is clearer now. Also, I understand my code is not very clean since I am a beginner and I hope you can help.
Thanks in advance.
In fact, you have a list of lists where each inner list contains a single string. As the format of that string is simple, you could use a regex to parse it and build a dictionary from it. Demo:
import re
import pprint
ll = [["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"],
["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"],
["'Person':Zian Fan,'Message':9"]]
rx = re.compile(r"\s*'Person'\s*:\s*(.*?)\s*,\s*'Message'\s*:\s*(.*)\s*$")
d = [{'Person': m.group(1), 'Message': m.group(2)}
     for m in [rx.match(i[0]) for i in ll]]
pprint.pprint(d)
gives as expected:
[{'Message': '10', 'Person': 'Yiyang Chen'},
{'Message': '2', 'Person': 'Junbo Sheng'},
{'Message': '4', 'Person': 'Jiayi Lin'},
{'Message': '8', 'Person': 'Baitong Liu'},
{'Message': '9', 'Person': 'Zian Fan'}]
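Note that both captured groups are strings, so 'Message' comes out as '10' rather than 10. If the JSON should hold a number, as in the target format in the question, one possible variant converts the second group with int() and skips any string that does not match. This is only a sketch and not part of the demo above; the name records and the shortened input list are illustrative assumptions:

```python
import re

# shortened copy of the input, just for illustration
ll = [["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"]]

# same pattern as above, except the count must be digits
rx = re.compile(r"\s*'Person'\s*:\s*(.*?)\s*,\s*'Message'\s*:\s*(\d+)\s*$")

records = []
for inner in ll:
    m = rx.match(inner[0])
    if m is None:
        # ignore strings that do not follow the expected format
        continue
    records.append({'Person': m.group(1), 'Message': int(m.group(2))})

print(records)
```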
But after seeing the way you build the list of lists, it would be much simpler to build a list of dictionaries directly. You just have to slightly change the end of your script:
...
# print(d)
for person in d:
    final_use.append({'Person': person, 'Message': d[person]})
print(final_use)
and final_use can be used directly to generate a JSON string or file...
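As a rough sketch of that last step, the standard json module can serialize such a list as it is. The file name output.json and the sample values below are placeholders, not taken from the original script:

```python
import json

# stand-in for the list of dictionaries built by the loop above
final_use = [{'Person': 'Yiyang Chen', 'Message': 10},
             {'Person': 'Junbo Sheng', 'Message': 2}]

# json.dumps returns the JSON text as a string
json_text = json.dumps(final_use)
print(json_text)

# json.dump writes the same data to an open file
with open('output.json', 'w') as out:
    json.dump(final_use, out)
```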
Here is my suggestion, using a function to transform each item of your list to desired dictionary:
l=[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]
def f(x):
    x2=x[0]
    x3=x2.split(',')
    x4={i.split(':')[0][1:-1]:int(i.split(':')[1]) if i.split(':')[1].isdigit() else i.split(':')[1] for i in x3}
    return x4
res=[f(i) for i in l]
print(res)
Output:
[{'Person': 'Yiyang Chen', 'Message': 10}, {'Person': 'Junbo Sheng', 'Message': 2}, {'Person': 'Jiayi Lin', 'Message': 4}, {'Person': 'Baitong Liu', 'Message': 8}, {'Person': 'Zian Fan', 'Message': 9}]
The main problem with your original code is that you represent structured data as a string and then try to convert that string back into usable data.
As you've encountered, this becomes quite cumbersome to deal with, since you are creating a non-standard format and trying to parse that in a subsequent step.
What you can do instead is store the data in a structured way throughout your code.
One method is to break the problem into two steps.
Below, I use collections.defaultdict to keep a count of the number of messages sent by each user.
Then, I use a list comprehension to convert that into a list of dicts.
You can also clean up the data extraction a little by making use of the maxsplit argument of str.split.
import collections
counts = collections.defaultdict(int)
with open('input.txt') as f:
    for line in f:
        # skip the empty lines between entries
        if not line.strip():
            continue
        # first, remove the unwanted colons from the line
        line = line.replace(':', '')
        # next, split the line up (at most 3 splits)
        # we "discard" the first & last fields, and keep only the middle two (first & last name)
        _, first, last, _ = line.split(maxsplit=3)
        # increment the number of messages for this user
        # using an f-string to combine the two names into a string that can be used as a key
        counts[f'{first} {last}'] += 1
# now, loop through the key-value pairs, and convert each into a dict (rather than a string representation)
result = [{'Person': k, 'Message': v} for k, v in counts.items()]
Essentially, this version follows the same pattern as your original version, except the first part is a lot simpler & your final loop is replaced by a list comprehension that creates a list of dicts as opposed to a nested list of strings.
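To see what the maxsplit argument does in the parsing step, here is a small standalone check on one line taken from the sample input. The variable names are only for illustration:

```python
line = "00:01:47 Baitong Liu: yes, email me"

# without maxsplit, every run of whitespace is a split point,
# so the message text is broken into separate words
words = line.split()
print(words)
# ['00:01:47', 'Baitong', 'Liu:', 'yes,', 'email', 'me']

# with maxsplit=3, at most 3 splits are made, so the message stays in one piece
parts = line.replace(':', '').split(maxsplit=3)
print(parts)
# ['000147', 'Baitong', 'Liu', 'yes, email me']
```

This relies on every name being exactly two words, as in the sample data; a one-word or three-word name would need different handling.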