Pandas - Convert CSV to JSON - Organize with groupby, then remove the index key

Question

I have a CSV file that I'm trying to convert to JSON with Pandas. It has multiple columns, but for the sake of simplicity let's just say it has 3: "region", "state", and "salesperson". Three columns, with rows that often contain repeating values (state names and such).

My ideal result is:

{
    "salesperson": [
        {
            "name": "John Doe",
            "values": [
                {
                    "region": "North America",
                    "state": "Connecticut"
                },
                {
                    "region": "North America",
                    "state": "Vermont"
                }
            ]
        },
        {
            "name": "Jane Doe",
            "values": [
                {
                    "region": "North America",
                    "state": "New York"
                },
                {
                    "region": "North America",
                    "state": "New Hampshire"
                }
            ]
        }
    ]
}

This is what I currently have for reading the data and turning it into JSON.

import pandas as pd

df = pd.read_csv('Foo.csv', encoding="ISO-8859-1",
                 escapechar='\\')
result = (df.groupby(['salesperson'])
            .apply(lambda x: x.to_dict('records'))
            .to_json(orient='table')
            )
return result

.to_json(orient='table') is close; it gives me:

"data": [
    {
        "salesperson": "John Doe",
        "values": [
            {
                "region": "North America",
                "state": "Connecticut",
                "salesperson": "John Doe"
            },

However, "salesperson" is still in the "values". I've tried:

result = (df.groupby(['salesperson'])
            .apply(lambda x: x.to_dict('records'))
            .drop('salesperson')
            .to_json(orient='table')
            )

But that doesn't seem to be the correct way.

I'm not sure how to tell it to use "salesperson" as the index and remove it from the output, without actually editing the JSON file after it's created.

Answer 1

The code below deletes the key that is not needed.

Step-1: Assign a variable

data = {
        "salesperson": "John Doe",
        "values": [
            {
                "region": "North America",
                "state": "Connecticut",
                "salesperson": "John Doe"
            }]
       }

Step-2: Delete key

del data['salesperson']

Output:
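
Note that, on the dictionary from Step-1, del data['salesperson'] removes only the top-level key; the copy nested inside each "values" entry is left in place. A minimal sketch of removing the nested key instead, assuming the same dictionary shape (the loop is an illustration, not part of the answer above):

```python
data = {
    "salesperson": "John Doe",
    "values": [
        {
            "region": "North America",
            "state": "Connecticut",
            "salesperson": "John Doe",
        }
    ],
}

# Remove the nested "salesperson" key from every entry in "values".
# pop() with a default avoids a KeyError if an entry lacks the key.
for entry in data["values"]:
    entry.pop("salesperson", None)

print(data)
```

This edits the already-built structure in Python, so it only helps if post-processing the data is acceptable.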

Answer 2

I had to apply drop to the key before applying to_dict():

# Drop the grouping column inside each group before building the records
result = df.groupby(df.salesperson).apply(
    lambda x: x.drop(columns='salesperson').to_dict('records')).to_json(orient='index')

This removed the key from the resulting JSON values while preserving it as the index.
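
With orient='index', the salesperson names become the top-level keys, which is close to, but not the same as, the nested "name"/"values" layout shown in the question. A rough sketch of one way to build that layout, assuming a small in-memory DataFrame in place of Foo.csv (the sample rows and the records variable are illustrative, not from the original post):

```python
import json

import pandas as pd

# Sample rows standing in for Foo.csv (assumed data, based on the question's example).
df = pd.DataFrame(
    {
        "region": ["North America"] * 4,
        "state": ["Connecticut", "Vermont", "New York", "New Hampshire"],
        "salesperson": ["John Doe", "John Doe", "Jane Doe", "Jane Doe"],
    }
)

# Iterate over the groups; sort=False keeps them in order of first appearance.
# The grouping column is dropped from each group before converting to records.
records = [
    {
        "name": name,
        "values": group.drop(columns="salesperson").to_dict("records"),
    }
    for name, group in df.groupby("salesperson", sort=False)
]

# Wrap the list under a single "salesperson" key and serialize.
result = json.dumps({"salesperson": records}, indent=4)
print(result)
```

Building plain Python dictionaries and serializing with json.dumps gives full control over the key names, at the cost of bypassing to_json.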
