
CSV to nested JSON using Python

I need to parse the following CSV data into a nested JSON string. Please advise how I can add "payment_mode" as a nested value of "cashier". I have tried a few things, such as creating another OrderedDict and appending it to the subset list, but this did not work as desired. I would appreciate any assistance.

CSV data:

Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50

Expected JSON format:

{
    "contract_no": "CS",
    "sales_date": "2020-04-06",
    "store_sales_amount": "822.17",
    "cashier": [
        {
            "cashier_counter": "C1",
            "discount_amount": "15",
            "before_tax_amount": "13.15",
            "tax_amount": "219.13",
            "cashier_amount": "232.28",
            "product": "100.12",
            "dine_in": "116.02",
            "take_away": "16.14",
            "payment_mode": [
                {
                    "mode": "CASH",
                    "amount": "112.46"
                }
            ]
        }
    ]
}

Current output:

{
"contract_no": "CS",
"sales_date": "2020-04-12",
"store_sales_amount": "18.50",
"cashier": [
    {
        "cashier_counter": "C1",
        "discount_amount": "0",
        "before_tax_amount": "18.50",
        "tax_amount": "0",
        "cashier_amount": "18.50",
        "product": "18.50",
        "dine_in": "0",
        "take_away": "0",
        "mode": "CASH",
        "cash_amount": "18.50"
    }
]
}

Code:

import pandas as pd
from itertools import groupby 
from collections import OrderedDict
import json    

#read csv into dataframe
df = pd.read_csv('sales2.csv', dtype={
        #level1
        "Contract_no" : str,
        "sales_date" : str,
        "store_sales_amount" : str,
        #level2 cashier
        "cashier_counter" : str,
        "discount_amount" : str,
        "before_tax_amount" : str,
        "tax_amount" : str,
        "cashier_amount" : str,
        "product" : str,
        "dine_in" : str,
        "take_away" : str,
        #level3 payment_mode
        "mode" : str,
        "cash_amount" : str         
    })
    
results = []

for (Contract_no, sales_date, store_sales_amount), bag in df.groupby(["Contract_no", "sales_date", "store_sales_amount"]):
    # drop the three top-level columns from the group
    contents_df = bag.drop(["Contract_no", "sales_date","store_sales_amount"], axis=1)
    for (mode, cash_amount), bag2 in contents_df.groupby(["mode", "cash_amount"]):
        subset = [OrderedDict(row) for i,row in contents_df.iterrows()]
        results.append(OrderedDict([("Contract_no", Contract_no),
                                    ("sales_date", sales_date),
                                    ("store_sales_amount", store_sales_amount),
                                    ("cashier", subset)]))

print(json.dumps(results[0], indent=4))
# with open('ExpectedJsonFile.json', 'w') as outfile:
#     outfile.write(json.dumps(results[0], indent=4))

Do you mean this?

import json

csv = """Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50"""

# table[0] is the header row, table[1] is the data row
table = [row.split(",") for row in csv.split("\n")]

"""
  0             1            2                    3                 4                 5                   6            7                8         9         10          11     12
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
| Contract_no | sales_date | store_sales_amount | cashier_counter | discount_amount | before_tax_amount | tax_amount | cashier_amount | product | dine_in | take_away | mode | amount  |
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
| CS          | 2020-04-12 | 18.50              | C1              | 0               | 18.50             | 0          | 18.50          | 18.50   | 0       | 0         | CASH | 1068.50 |
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
"""

jsn = {
    "contract_no":               table[1][0],
    "sales_date":                table[1][1],
    "store_sales_amount":        table[1][2],
    "cashier": [
        {
            "cashier_counter":   table[1][3],
            "discount_amount":   table[1][4],
            "before_tax_amount": table[1][5],
            "tax_amount":        table[1][6],
            "cashier_amount":    table[1][7],
            "product":           table[1][8],
            "dine_in":           table[1][9],
            "take_away":         table[1][10],
            "payment_mode": [
                {
                    "mode":      table[1][11],
                    "amount":    table[1][12]
                }
            ]
        },
    ]
}

print(json.dumps(jsn, indent=4))

Output:

{
    "contract_no": "CS",
    "sales_date": "2020-04-12",
    "store_sales_amount": "18.50",
    "cashier": [
        {
            "cashier_counter": "C1",
            "discount_amount": "0",
            "before_tax_amount": "18.50",
            "tax_amount": "0",
            "cashier_amount": "18.50",
            "product": "18.50",
            "dine_in": "0",
            "take_away": "0",
            "payment_mode": [
                {
                    "mode": "CASH",
                    "amount": "1068.50"
                }
            ]
        }
    ]
}
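
The hard-coded indices above can also be replaced by column names. The following is only a sketch under a few assumptions: it uses the standard-library csv.DictReader instead of splitting on commas (which would break on quoted fields), and the names CSV_TEXT, CASHIER_FIELDS, PAYMENT_FIELDS and row_to_nested are made up for illustration and do not come from the question.

```python
import csv
import io
import json

# Sample data copied from the question.
CSV_TEXT = """Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50"""

# Columns that belong to each nesting level (names taken from the CSV header).
CASHIER_FIELDS = ["cashier_counter", "discount_amount", "before_tax_amount",
                  "tax_amount", "cashier_amount", "product", "dine_in",
                  "take_away"]
PAYMENT_FIELDS = ["mode", "amount"]


def row_to_nested(row):
    """Hypothetical helper: turn one flat CSV row (a dict) into the nested layout."""
    cashier = {name: row[name] for name in CASHIER_FIELDS}
    # payment_mode becomes a list nested inside the cashier entry
    cashier["payment_mode"] = [{name: row[name] for name in PAYMENT_FIELDS}]
    return {
        "contract_no": row["Contract_no"],
        "sales_date": row["sales_date"],
        "store_sales_amount": row["store_sales_amount"],
        "cashier": [cashier],
    }


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
nested = row_to_nested(rows[0])
print(json.dumps(nested, indent=4))
```

To read from a file instead, the io.StringIO(CSV_TEXT) part could be swapped for an open file object, for example one opened on sales2.csv with newline="".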

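I don't know your workflow. The jsn dict can probably also be defined this way, taking the keys from the header row (note that the first key then becomes "Contract_no", capitalized as in the CSV header):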

jsn = {
    table[0][0]: table[1][0],
    table[0][1]: table[1][1],
    table[0][2]: table[1][2],
    "cashier": [
        {
            table[0][3]:  table[1][3],
            table[0][4]:  table[1][4],
            table[0][5]:  table[1][5],
            table[0][6]:  table[1][6],
            table[0][7]:  table[1][7],
            table[0][8]:  table[1][8],
            table[0][9]:  table[1][9],
            table[0][10]: table[1][10],
            "payment_mode": [
                {
                    table[0][11]: table[1][11],
                    table[0][12]: table[1][12]
                }
            ]
        },
    ]
}
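
Both variants above handle a single data row only. If the file holds several rows for the same contract (for example several cashier counters, or several payment modes per counter), the rows need to be grouped. Below is a minimal sketch of one way to do that with the standard library only, not pandas. The sample rows are made up for illustration and are not from the question, and nest_rows is a hypothetical helper name. It assumes that rows with identical cashier columns belong to the same cashier entry.

```python
import csv
import io
import json

# Made-up sample rows (NOT from the question): one contract, two cashier
# counters, and two payment modes for counter C1.
SAMPLE = """Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,60.00,C1,0,40.00,0,40.00,40.00,0,0,CASH,25.00
CS,2020-04-12,60.00,C1,0,40.00,0,40.00,40.00,0,0,CARD,15.00
CS,2020-04-12,60.00,C2,0,20.00,0,20.00,20.00,0,0,CASH,20.00"""

TOP_FIELDS = ["Contract_no", "sales_date", "store_sales_amount"]
CASHIER_FIELDS = ["cashier_counter", "discount_amount", "before_tax_amount",
                  "tax_amount", "cashier_amount", "product", "dine_in",
                  "take_away"]
PAYMENT_FIELDS = ["mode", "amount"]


def nest_rows(rows):
    """Hypothetical helper: group flat rows into contract -> cashier -> payment_mode."""
    contracts = {}  # keyed by the tuple of top-level column values
    for row in rows:
        top_key = tuple(row[name] for name in TOP_FIELDS)
        contract = contracts.get(top_key)
        if contract is None:
            contract = {
                "contract_no": row["Contract_no"],
                "sales_date": row["sales_date"],
                "store_sales_amount": row["store_sales_amount"],
                "cashier": [],
            }
            contracts[top_key] = contract

        # Reuse an existing cashier entry if all cashier columns match,
        # otherwise start a new one with an empty payment_mode list.
        cashier_values = {name: row[name] for name in CASHIER_FIELDS}
        for cashier in contract["cashier"]:
            if all(cashier[name] == value for name, value in cashier_values.items()):
                break
        else:
            cashier = dict(cashier_values, payment_mode=[])
            contract["cashier"].append(cashier)

        cashier["payment_mode"].append({name: row[name] for name in PAYMENT_FIELDS})
    return list(contracts.values())


result = nest_rows(csv.DictReader(io.StringIO(SAMPLE)))
print(json.dumps(result, indent=4))
```

The result is a list with one dict per distinct (Contract_no, sales_date, store_sales_amount) combination, so result[0] corresponds to the single object shown in the expected format.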
