
How do I convert a list of lists to a python dictionary of the following format?

I currently have a list of lists of the following type:

[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]

I am trying to convert this list of lists into a list of Python dictionaries that will be written (with json.dumps) to a JSON output file in the following format:

[{"Person":"John Smith","Message":8},…]

How do I achieve this?

For a list of lists like this one:

cars_list = [[1,'Honda','red'], [2,'Toyota','white'], [3,'Mazda','blue']]

I understand that the following code works:

cars_dict = {}

for key, car, color in cars_list:
    cars_dict[key] = [car, color]

But I'm unable to manipulate the former list of lists into the format I want, because each inner string is already in an a:b, c:d format.

Edit: This is the code I have written that gave me the resulting list of lists:

f = open("input.txt", "r")
# d = defaultdict(int)
keylist = []
final_use = []
for line in f:
    lineslist = line.split()
    nameslist = lineslist[1:3]
    nameslist = [s.replace(':', '') for s in nameslist]
    keylist.append(nameslist[0]+" "+nameslist[1])
# print(keylist)

    d = {}
    [d.__setitem__(item,1+d.get(item,0)) for item in keylist]
# print(d)

for person in d:
    
    final_use.append(["'Person':"+str(person)+","+"'Message':"+str(d[person])])
print(final_use)

The output of this code is the list of lists shown at the beginning:

[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]

The following is a sample of the data in input.txt (the full file is huge, so it is not included here). Note: there are empty lines between entries.

00:01:44 Yiyang Chen: hello

00:01:46 Junbo Sheng: good morning

00:01:46 Jiayi Lin: 1

00:01:47 Baitong Liu: yes, email me

00:01:47 Zian Fan: afternoon batch

00:01:48 Leon Luc: 1

00:01:48 Zhiqian Wang: 1

00:01:49 Jiahui Lu: 1

00:01:49 Shiming Chen: 1

00:07:47 Yanru Jiang: 1

Description of what this is about: this is a sample of a Zoom chat that I am trying to process. I am taking this input.txt file and trying to output a JSON file that lists each person's name and the number of messages that person sent in the chat, in the following (example) format:

[{"Person":"John Smith","Message":8},
 {"Person":"Yiyang Chen","Message":10},
 {"Person":"Junbo Sheng","Message":2}…]

I hope this is clearer now. Also, I understand my code is not very clean since I am a beginner and I hope you can help.

Thanks in advance.

What you have is a list of lists in which each inner list contains a single string. As the format of that string is simple, you could use a regex to parse it and build a dictionary from it. Demo:

import re
import pprint

ll = [["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"],
      ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"],
      ["'Person':Zian Fan,'Message':9"]]
rx = re.compile(r"\s*'Person'\s*:\s*(.*?)\s*,\s*'Message'\s*:\s*(.*)\s*$")
d = [{'Person': m.group(1), 'Message': m.group(2)}
     for m in [rx.match(i[0]) for i in ll]]
pprint.pprint(d)

gives the following (note that the Message values are strings here, not integers):

[{'Message': '10', 'Person': 'Yiyang Chen'},
 {'Message': '2', 'Person': 'Junbo Sheng'},
 {'Message': '4', 'Person': 'Jiayi Lin'},
 {'Message': '8', 'Person': 'Baitong Liu'},
 {'Message': '9', 'Person': 'Zian Fan'}]
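
If the counts should end up as numbers rather than strings in the JSON output, one option is to convert the second group with int() while building the dictionaries. A minimal sketch, reusing the ll and rx names from the demo above with a shortened ll; the records name and the None check are additions for illustration:

```python
import json
import re

# shortened version of the list from the demo above
ll = [["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"]]
rx = re.compile(r"\s*'Person'\s*:\s*(.*?)\s*,\s*'Message'\s*:\s*(.*)\s*$")

records = []
for inner in ll:
    m = rx.match(inner[0])
    if m is None:
        # skip strings that do not follow the expected format
        continue
    # int() turns the captured count into a number
    records.append({'Person': m.group(1), 'Message': int(m.group(2))})

print(json.dumps(records))
```

json.dumps then writes the counts without quotes, which matches the format asked for in the question.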

But having seen how you build the list of lists, it would be much simpler to build a list of dictionaries directly. You just have to slightly change the end of your script:

...
# print(d)

for person in d:
    
    final_use.append({'Person': person, 'Message': d[person]})
print(final_use)

and final_use can directly be used to generate a JSON string or file...
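
For example, a minimal sketch of that last step with the standard json module. The file name output.json and the two sample entries are assumptions for illustration, not part of your script:

```python
import json

# sample data in the shape produced by the loop above (assumed values)
final_use = [{'Person': 'Yiyang Chen', 'Message': 10},
             {'Person': 'Junbo Sheng', 'Message': 2}]

# json.dump writes the list of dicts to the file as a JSON array
with open('output.json', 'w') as out:  # 'output.json' is a hypothetical file name
    json.dump(final_use, out)

# json.dumps returns the same JSON as a string instead
json_text = json.dumps(final_use)
print(json_text)
```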

Here is my suggestion, using a function to transform each item of your list into the desired dictionary:

l=[["'Person':Yiyang Chen,'Message':10"], ["'Person':Junbo Sheng,'Message':2"], ["'Person':Jiayi Lin,'Message':4"], ["'Person':Baitong Liu,'Message':8"], ["'Person':Zian Fan,'Message':9"]]

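def f(x):
    # each inner list holds a single string such as "'Person':Yiyang Chen,'Message':10"
    result = {}
    for pair in x[0].split(','):
        parts = pair.split(':')
        key, value = parts[0], parts[1]
        # strip the surrounding quotes from the key; convert numeric values to int
        result[key[1:-1]] = int(value) if value.isdigit() else value
    return result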

res=[f(i) for i in l]

print(res)

Output:

[{'Person': 'Yiyang Chen', 'Message': 10}, {'Person': 'Junbo Sheng', 'Message': 2}, {'Person': 'Jiayi Lin', 'Message': 4}, {'Person': 'Baitong Liu', 'Message': 8}, {'Person': 'Zian Fan', 'Message': 9}]

The main problem with your original code is that you represent structured data as a string and then try to convert that string back into usable data.

As you've encountered, this becomes quite cumbersome, since you are creating a non-standard format and then parsing it in a subsequent step.

What you can do instead is store the data in a structured way throughout your code.

One method is to break the problem into two steps:

  1. Store the count of messages as a dictionary, mapping each person's name to the total count of messages.
  2. Convert that to the format you want -- a list of dicts.

Below, I use collections.defaultdict to keep a count of the number of messages sent by each user.

Then, I use a list comprehension to convert that into a list of dicts.

You can also clean up the data extraction a little by making use of the maxsplit argument of str.split.

import collections

counts = collections.defaultdict(int)

with open('input.txt') as f:
    for line in f:
        # skip the empty lines between entries (they cannot be unpacked below)
        if not line.strip():
            continue

        # first, remove the unwanted colons from the line
        line = line.replace(':', '')
        
        # next, split the line up (at most 3 splits)
        # we "discard" the first & last fields, and keep only the middle two (first & last name)
        _, first, last, _ = line.split(maxsplit=3)

        # increment the number of messages for this user
        # using an f-string to combine the two names into a string that can be used as a key
        counts[f'{first} {last}'] += 1

# now, loop through the key-value pairs, and convert each into a dict (rather than a string representation)
result = [{'Person': k, 'Message': v} for k, v in counts.items()]

Essentially, this version follows the same pattern as your original version, except the first part is a lot simpler & your final loop is replaced by a list comprehension that creates a list of dicts as opposed to a nested list of strings.
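
To see what the maxsplit step does on individual lines, here is a small sketch that runs the same logic on an in-memory list instead of input.txt. The sample_lines list is an assumption for illustration (its last entry is made up so that one person appears twice), and the key is spelled Message to match the format requested in the question:

```python
import collections
import json

# a few lines in the format of input.txt, including a blank line (assumed sample)
sample_lines = [
    "00:01:44 Yiyang Chen: hello\n",
    "\n",
    "00:01:47 Baitong Liu: yes, email me\n",
    "00:02:10 Yiyang Chen: thanks\n",
]

# maxsplit=3 splits at most three times, so the message text stays in one piece
print("000147 Baitong Liu yes, email me".split(maxsplit=3))

counts = collections.defaultdict(int)
for line in sample_lines:
    if not line.strip():
        continue  # skip blank lines
    _, first, last, _ = line.replace(':', '').split(maxsplit=3)
    counts[f'{first} {last}'] += 1

result = [{'Person': k, 'Message': v} for k, v in counts.items()]
print(json.dumps(result))
```

This approach assumes every name consists of exactly two words; a name with more or fewer words would need different parsing.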
