
How to merge multiple JSON files (metadata from a website) into one JSON file for EDA in Python?

I am dealing with 2 folders containing mixed JSON files and collected audio recordings (around 400 JSON files). Each JSON file looks like this:

{
  "id":"c79c32e7-6665-4c5e-9458-d15930488263",
  "age":"34",
  "gender":"m",
  "healthStatus":"healthy",
  "audioFiles":[
    "1585940317337_sentence_healthy_m_34_c79c32e7-6665-4c5e-9458d15930488263.wav",
    "1585940317337_cough_healthy_m_34_c79c32e7-6665-4c5e-9458-d15930488263.wav",
    "1585940317337_breath_healthy_m_34_c79c32e7-6665-4c5e-9458d15930488263.wav"
  ]
}

I want to retrieve age, gender, and healthStatus and merge them into one JSON file for analysis in Python.

To do this, I wrote:

from pathlib import Path
import glob
import json
data_folder = Path("/Users/jiani/Desktop/Voicemed/#ml/cough_classification-original_experiment/new_data/meta1")
read_files = glob.glob("data_folder/.json")

output_list = []

for f in read_files:
    with open(f, "rb") as infile:
        output_list.append(json.load(infile))

with open("merged_file.json", "w") as outfile:
    json.dump(output_list, outfile)

and then I printed output_list, but it is empty. I have read some related solutions, but I still couldn't work out the answer. Could someone help me?

Thank you very much.

Try this:

from os import listdir
from os.path import isfile, join
import json

data_folder = "full/path/to/jsons"
files = [join(data_folder,f) for f in listdir(data_folder) if isfile(join(data_folder, f)) and f.endswith(".json")]

output_list = []
for file in files:
    with open(file, "r") as f:
        output_list.append({k:v for k,v in json.load(f).items() if k in ["age","gender","healthStatus"]})

with open("merged_file.json", "w") as outfile:
    json.dump(output_list, outfile)
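
For reference, the original snippet returns an empty list because glob.glob("data_folder/.json") searches for a literal path named data_folder/.json rather than using the data_folder variable, and the pattern has no * wildcard, so nothing matches. A minimal sketch of the glob/pathlib variant from the question with the pattern corrected (the folder path below is a shortened placeholder; point it at your own meta1 folder):

import glob
import json
from pathlib import Path

# Folder that holds the per-recording metadata JSON files.
data_folder = Path("/path/to/new_data/meta1")

# Build the pattern from the Path object and include the * wildcard.
read_files = glob.glob(str(data_folder / "*.json"))

output_list = []
for f in read_files:
    with open(f, "r") as infile:
        record = json.load(infile)
        # Keep only the fields needed for the analysis.
        output_list.append({k: record[k] for k in ("age", "gender", "healthStatus") if k in record})

with open("merged_file.json", "w") as outfile:
    json.dump(output_list, outfile, indent=2)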

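Since the goal is EDA, the merged list of records can then be loaded straight into a pandas DataFrame. A small usage sketch, assuming pandas is installed (the checks below are just illustrative):

import pandas as pd

# merged_file.json holds a list of records, so each record becomes a row.
df = pd.read_json("merged_file.json")

# Quick exploratory checks on the three collected fields.
print(df[["age", "gender", "healthStatus"]].head())
print(df["gender"].value_counts())
print(df["healthStatus"].value_counts())
print(pd.to_numeric(df["age"], errors="coerce").describe())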