Converting a DataFrame with a column of JSON lists to JSON

Question

I am trying to read a .csv file in chunks and convert each chunk to JSON. The problem is that the csv has one column that holds a list of JSON objects (replies in this case):

_id,title,description,count,replies
859f41bd,thr,hrt,5,[]
2816b949,fasd,asdf,2,[{'id': '1e8djah', 'description': 'hey'}]

When I run

    for chunk in pd.read_csv(FILE_NAME, chunksize=BATCH_SIZE):
        chunk_to_json = pd.DataFrame.to_json(chunk, orient='records')

chunk_to_json contains the replies list as a string rather than as a list:

"replies":"[{'id': '1e8djah', 'description': 'hey'}]"

The column's type is reported as object when I check dtypes, and running chunk['replies'].apply(lambda x: json.loads(x)) raises json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 3 (char 2). The output I want is:

"replies": [{'id': '1e8djah', 'description': 'hey'}]

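Is there an easy way to parse this? I also have the option of changing how the data is written to the .csv. I used pandas' to_json to put replies into the csv, so the double-quote requirement seems odd.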

Answer 1

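Single quotes are not valid JSON: the format requires double quotes around strings and property names. You can simply replace the single quotes with double quotes: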

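    import json

    # The replies cell as read from the CSV, with Python-style single quotes
    input_string = "[{'id': '1e8djah', 'description': 'hey'}]"
    # Swap single quotes for double quotes so the text becomes valid JSON
    input_string = input_string.replace("'", '"')

    result = json.loads(input_string)

    print(result)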

Output:

[{'id': '1e8djah', 'description': 'hey'}]
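
The quote replacement above would break if a reply's text itself contained an apostrophe or a double quote. As a hedged alternative (not part of the original answer), the sketch below parses the column with ast.literal_eval, which reads Python-style literals directly, and applies it to each chunk before calling to_json. The inline CSV_TEXT and the helper name chunk_to_records are illustrative assumptions, and the sketch assumes the replies cell is quoted in the file, since it contains commas.

```python
import ast
import io
import json

import pandas as pd

# Sample data mirroring the CSV in the question. The replies cell is quoted
# here because it contains commas (an assumption about the real file).
CSV_TEXT = '''_id,title,description,count,replies
859f41bd,thr,hrt,5,[]
2816b949,fasd,asdf,2,"[{'id': '1e8djah', 'description': 'hey'}]"
'''


def chunk_to_records(chunk):
    """Return a JSON string for one chunk, with replies as real lists."""
    chunk = chunk.copy()
    # literal_eval parses Python literals (single-quoted strings included)
    # without executing arbitrary code.
    chunk["replies"] = chunk["replies"].apply(ast.literal_eval)
    return chunk.to_json(orient="records")


# chunksize=1 only to exercise the chunked path on this tiny sample
json_chunks = [
    chunk_to_records(chunk)
    for chunk in pd.read_csv(io.StringIO(CSV_TEXT), chunksize=1)
]

# Decode the JSON again to confirm replies is now a list, not a string
records = [rec for js in json_chunks for rec in json.loads(js)]
print(records[1]["replies"])
```

With a real file, io.StringIO(CSV_TEXT) would be replaced by FILE_NAME and chunksize by BATCH_SIZE, as in the question. If you control how the csv is written, storing replies as valid JSON text (for example via json.dumps) would let json.loads work directly.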
