[英]Merge the json objects with same key value pair in a file using python
I have a file with objects in it like below.我有一个包含对象的文件,如下所示。
Eg: Input.txt例如:Input.txt
1. {"Cp": "1000", "Af": "CBS", "Bp": "150", "Vt": "channel", "Ti": "Q2", "Cs": "K11HE-D", "Tg": "BROADCAST<>LOCAL<>HD", "Fd": "dish#K11HE-D", "Pi": "CHAF2", "Gi": "RV1688668060"}
2. {"Cp": "1000", "Af": "CBS", "Bp": "150", "Vt": "channel", "Ti": "Q2", "Cs": "K08JV-D", "Tg": "BROADCAST<>LOCAL<>HD", "Fd": "dish#K08JV-D", "Pi": "CHAF2", "Gi": "RV1714277379"}
3. {"Cp": "1000", "Af": "CBS", "Bp": "150", "Vt": "channel", "Ti": "ABCD", "Cs": "K20LT-D", "Tg": "BROADCAST<>LOCAL<>HD", "Fd": "dish#K20LT-D", "Pi": "CHAF2", "Gi": "RV1714278093"}
4. {"Cp": "1000", "Af": "CBS", "Bp": "150", "Vt": "channel", "Ti": "Q2", "Cs": "K08OW-D", "Tg": "BROADCAST<>LOCAL<>HD", "Fd": "dish#K08OW-D", "Pi": "CHAF2", "Gi": "RV1714277380"}
The file contains thousands of rows.该文件包含数千行。
I want to group all those json objects in the file, which has the same value for the key " Ti ".我想将文件中的所有这些 json 对象分组,这些对象对于键“ Ti ”具有相同的值。
Below is an example to elaborate more on my requirement.下面是一个更详细地说明我的要求的例子。
You can see from the sample file above, there are 3 lines with the same value of for key "Ti".您可以从上面的示例文件中看到,对于键“Ti”,有 3 行具有相同的值。 That is line 1, 2 and 4. They have all the value for "Ti" as "Q2".
那是第 1、2 和 4 行。它们将“Ti”的所有值都作为“Q2”。
I need a way to join those JSON objects, and I want to create an output file, that looks like below.我需要一种方法来连接这些 JSON 对象,并且我想创建一个输出文件,如下所示。
Eg: Output.txt例如:输出.txt
1. {"Cp": "[1000, 1000, 1000]", "Af": "['CBS', 'CBS', 'CBS']", "Bp": "[150, 150, 150]", "Vt": "['channel', 'channel', 'channel']", "Ti": "['Q2', 'Q2', 'Q2']", "Cs": "['K11HE-D', 'K08JV-D', 'K08OW-D' ]", "Tg": "['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD, 'BROADCAST<>LOCAL<>HD]", "Fd": "['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D']", "Pi": "['CHAF2','CHAF2','CHAF2']", "Gi": "['RV1688668060', 'RV1714277379', 'RV1714277380']"}
2. {"Cp": "[1000, 1000, 1000]", "Af": "['CBS', 'CBS', 'CBS']", "Bp": "[150, 150, 150]", "Vt": "['channel', 'channel', 'channel']", "Ti": "['Q2', 'Q2', 'Q2']", "Cs": "['K11HE-D', 'K08JV-D', 'K08OW-D' ]", "Tg": "['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD, 'BROADCAST<>LOCAL<>HD]", "Fd": "['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D']", "Pi": "['CHAF2','CHAF2','CHAF2']", "Gi": "['RV1688668060', 'RV1714277379', 'RV1714277380']"}
3. {"Cp": "1000", "Af": "CBS", "Bp": "150", "Vt": "channel", "Ti": "ABCD", "Cs": "K20LT-D", "Tg": "BROADCAST<>LOCAL<>HD", "Fd": "dish#K20LT-D", "Pi": "CHAF2", "Gi": "RV1714278093"}
4. {"Cp": "[1000, 1000, 1000]", "Af": "['CBS', 'CBS', 'CBS']", "Bp": "[150, 150, 150]", "Vt": "['channel', 'channel', 'channel']", "Ti": "['Q2', 'Q2', 'Q2']", "Cs": "['K11HE-D', 'K08JV-D', 'K08OW-D' ]", "Tg": "['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD, 'BROADCAST<>LOCAL<>HD]", "Fd": "['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D']", "Pi": "['CHAF2','CHAF2','CHAF2']", "Gi": "['RV1688668060', 'RV1714277379', 'RV1714277380']"}
Please let me know, how can I achieve this.请让我知道,我怎样才能做到这一点。
You need to:你需要:
import re
raw_data = open('test.txt', 'r')
data_list = raw_data.read().splitlines()
data_list = list(filter(None, data_list))
# create list of Ti values
ti_list = []
for item in data_list:
number = re.search('\d+\.', item).group(0)
row = re.sub('\d+\. ', '', item)
row_dictionary = eval(row)
ti_list.append(row_dictionary.get('Ti'))
# collect data into new dictionary
data = {}
i = 1
for ti in ti_list:
raw = {}
for item in data_list:
number = re.search('\d+\.', item).group(0)
row = re.sub('\d+\. ', '', item)
row_dictionary = eval(row)
if row_dictionary.get('Ti') == ti:
for key, value in row_dictionary.items():
raw.setdefault(key, []).append(value)
data[str(i)+'.'] = raw
i += 1
Output:
输出:
1. {'Cp': ['1000', '1000', '1000'], 'Af': ['CBS', 'CBS', 'CBS'], 'Bp': ['150', '150', '150'], 'Vt': ['channel', 'channel', 'channel'], 'Ti': ['Q2', 'Q2', 'Q2'], 'Cs': ['K11HE-D', 'K08JV-D', 'K08OW-D'], 'Tg': ['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD'], 'Fd': ['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D'], 'Pi': ['CHAF2', 'CHAF2', 'CHAF2'], 'Gi': ['RV1688668060', 'RV1714277379', 'RV1714277380']}
2. {'Cp': ['1000', '1000', '1000'], 'Af': ['CBS', 'CBS', 'CBS'], 'Bp': ['150', '150', '150'], 'Vt': ['channel', 'channel', 'channel'], 'Ti': ['Q2', 'Q2', 'Q2'], 'Cs': ['K11HE-D', 'K08JV-D', 'K08OW-D'], 'Tg': ['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD'], 'Fd': ['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D'], 'Pi': ['CHAF2', 'CHAF2', 'CHAF2'], 'Gi': ['RV1688668060', 'RV1714277379', 'RV1714277380']}
3. {'Cp': ['1000'], 'Af': ['CBS'], 'Bp': ['150'], 'Vt': ['channel'], 'Ti': ['ABCD'], 'Cs': ['K20LT-D'], 'Tg': ['BROADCAST<>LOCAL<>HD'], 'Fd': ['dish#K20LT-D'], 'Pi': ['CHAF2'], 'Gi': ['RV1714278093']}
4. {'Cp': ['1000', '1000', '1000'], 'Af': ['CBS', 'CBS', 'CBS'], 'Bp': ['150', '150', '150'], 'Vt': ['channel', 'channel', 'channel'], 'Ti': ['Q2', 'Q2', 'Q2'], 'Cs': ['K11HE-D', 'K08JV-D', 'K08OW-D'], 'Tg': ['BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD', 'BROADCAST<>LOCAL<>HD'], 'Fd': ['dish#K11HE-D', 'dish#K08JV-D', 'dish#K08OW-D'], 'Pi': ['CHAF2', 'CHAF2', 'CHAF2'], 'Gi': ['RV1688668060', 'RV1714277379', 'RV1714277380']}
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.