Creating Pandas dataframe from multiple json files

I have a directory full of JSON files that I need to extract information from and convert into a Pandas dataframe. My current solution works, but I have a feeling that there is a more elegant way of doing this:

for entry in os.scandir(directory):
    if entry.path.endswith(".json"):
        with open(entry.path) as f:
            data = json.load(f)
            ...
            newline = field1 + ',' + field2 + ',' + ... +  ',' + fieldn
            output.append(newline)
...
df = pd.read_csv(io.StringIO('\n'.join(output)))

Yes, this can be done better.

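import os
import pandas as pd
from glob import glob

# `path` is the directory that holds the JSON files (`directory` in the question).
all_files = glob(os.path.join(path, "*.json"))
# Lazily read each file into its own DataFrame.
ind_df = (pd.read_json(f) for f in all_files)
# Concatenate once and renumber the rows 0..n-1.
df = pd.concat(ind_df, ignore_index=True)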

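Passing a generator expression to pd.concat avoids building an explicit intermediate list of DataFrames in your own code. Note that pd.concat still needs every frame in memory to produce the result, so the main gain over the original approach is that pd.read_json parses each file directly and the frames are concatenated once, rather than assembling CSV text by hand.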
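If each file holds a single record and only a few fields are needed, as the question's loop suggests, pd.read_json may not fit, since it expects each file to already have a table-like layout. A minimal sketch of an alternative is to collect one dict per file and build the frame once at the end. The helper name `load_json_dir`, the field names, and the demo files below are made up for illustration and are not from the original post.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_json_dir(directory, fields):
    """Build one DataFrame from every *.json file in `directory`.

    `fields` is a hypothetical list of top-level keys to keep; a key that is
    missing from a file ends up as a missing value in that row.
    """
    rows = []
    for json_path in sorted(Path(directory).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        # Keep only the requested keys; one dict per file becomes one row.
        rows.append({name: data.get(name) for name in fields})
    return pd.DataFrame(rows, columns=fields)


# Demo with two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.json").write_text(
        json.dumps({"field1": 1, "field2": "x", "extra": True}), encoding="utf-8"
    )
    Path(tmp, "b.json").write_text(
        json.dumps({"field1": 2, "field2": "y"}), encoding="utf-8"
    )
    df = load_json_dir(tmp, ["field1", "field2"])

print(df)
```

Building a list of dicts and calling pd.DataFrame once avoids both the CSV round trip and repeated appends to a growing frame. For nested records, pd.json_normalize may be a better fit than the plain dict lookup used here.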
