CSV to nested JSON using Python
I need to parse the following CSV data into a nested JSON string. Please advise how I would go about adding "payment_mode" as a nested value of "cashier". I have tried a few things, such as creating another OrderedDict and appending it to the subset list, but this did not work as desired. I would appreciate any assistance.
CSV data:
Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50
Expected JSON format:
{
    "contract_no": "CS",
    "sales_date": "2020-04-06",
    "store_sales_amount": "822.17",
    "cashier": [
        {
            "cashier_counter": "C1",
            "discount_amount": "15",
            "before_tax_amount": "13.15",
            "tax_amount": "219.13",
            "cashier_amount": "232.28",
            "product": "100.12",
            "dine_in": "116.02",
            "take_away": "16.14",
            "payment_mode": [
                {
                    "mode": "CASH",
                    "amount": "112.46"
                }
            ]
        }
    ]
}
Current output:
{
    "contract_no": "CS",
    "sales_date": "2020-04-12",
    "store_sales_amount": "18.50",
    "cashier": [
        {
            "cashier_counter": "C1",
            "discount_amount": "0",
            "before_tax_amount": "18.50",
            "tax_amount": "0",
            "cashier_amount": "18.50",
            "product": "18.50",
            "dine_in": "0",
            "take_away": "0",
            "mode": "CASH",
            "cash_amount": "18.50"
        }
    ]
}
Code:
import pandas as pd
from itertools import groupby
from collections import OrderedDict
import json
# read csv into dataframe
df = pd.read_csv('sales2.csv', dtype={
    # level 1
    "Contract_no": str,
    "sales_date": str,
    "store_sales_amount": str,
    # level 2: cashier
    "cashier_counter": str,
    "discount_amount": str,
    "before_tax_amount": str,
    "tax_amount": str,
    "cashier_amount": str,
    "product": str,
    "dine_in": str,
    "take_away": str,
    # level 3: payment_mode
    "mode": str,
    "cash_amount": str
})
results = []
for (Contract_no, sales_date, store_sales_amount), bag in df.groupby(["Contract_no", "sales_date", "store_sales_amount"]):
    # remove the 3 top-level columns from the group
    contents_df = bag.drop(["Contract_no", "sales_date", "store_sales_amount"], axis=1)
    for (mode, cash_amount), bag2 in contents_df.groupby(["mode", "cash_amount"]):
        subset = [OrderedDict(row) for i, row in contents_df.iterrows()]
    results.append(OrderedDict([("Contract_no", Contract_no),
                                ("sales_date", sales_date),
                                ("store_sales_amount", store_sales_amount),
                                ("cashier", subset)]))
print(json.dumps(results[0], indent=4))
# with open('ExpectedJsonFile.json', 'w') as outfile:
#     outfile.write(json.dumps(results[0], indent=4))
Do you mean this?
import json
csv = """Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50"""
table = [row.split(",") for row in csv.split("\n")]
"""
  0             1            2                    3                 4                 5                   6            7                8         9         10          11     12
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
| Contract_no | sales_date | store_sales_amount | cashier_counter | discount_amount | before_tax_amount | tax_amount | cashier_amount | product | dine_in | take_away | mode | amount  |
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
| CS          | 2020-04-12 | 18.50              | C1              | 0               | 18.50             | 0          | 18.50          | 18.50   | 0       | 0         | CASH | 1068.50 |
|-------------|------------|--------------------|-----------------|-----------------|-------------------|------------|----------------|---------|---------|-----------|------|---------|
"""
jsn = {
    "contract_no": table[1][0],
    "sales_date": table[1][1],
    "store_sales_amount": table[1][2],
    "cashier": [
        {
            "cashier_counter": table[1][3],
            "discount_amount": table[1][4],
            "before_tax_amount": table[1][5],
            "tax_amount": table[1][6],
            "cashier_amount": table[1][7],
            "product": table[1][8],
            "dine_in": table[1][9],
            "take_away": table[1][10],
            "payment_mode": [
                {
                    "mode": table[1][11],
                    "amount": table[1][12]
                }
            ]
        },
    ]
}
print(json.dumps(jsn, indent=4))
Output:
{
    "contract_no": "CS",
    "sales_date": "2020-04-12",
    "store_sales_amount": "18.50",
    "cashier": [
        {
            "cashier_counter": "C1",
            "discount_amount": "0",
            "before_tax_amount": "18.50",
            "tax_amount": "0",
            "cashier_amount": "18.50",
            "product": "18.50",
            "dine_in": "0",
            "take_away": "0",
            "payment_mode": [
                {
                    "mode": "CASH",
                    "amount": "1068.50"
                }
            ]
        }
    ]
}
I don't know your workflow. Probably jsn can be defined this way as well, taking the keys from the header row:
jsn = {
    table[0][0]: table[1][0],
    table[0][1]: table[1][1],
    table[0][2]: table[1][2],
    "cashier": [
        {
            table[0][3]: table[1][3],
            table[0][4]: table[1][4],
            table[0][5]: table[1][5],
            table[0][6]: table[1][6],
            table[0][7]: table[1][7],
            table[0][8]: table[1][8],
            table[0][9]: table[1][9],
            table[0][10]: table[1][10],
            "payment_mode": [
                {
                    table[0][11]: table[1][11],
                    table[0][12]: table[1][12]
                }
            ]
        },
    ]
}
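Both definitions of jsn above read only the first data row by position. If the real file has several rows, for example more than one payment mode per cashier, the rows probably need to be grouped first. Below is a minimal sketch of one way to do that with the standard library only. It assumes the column grouping shown in the question; the names TOP_KEYS, CASHIER_KEYS, PAYMENT_KEYS and nest_rows are made up for illustration, and the second sample row (CARD, 25.00) is invented to show two payment modes under the same cashier.

```python
import csv
import io
import json

# Column groups per nesting level, taken from the CSV header in the question.
TOP_KEYS = ["Contract_no", "sales_date", "store_sales_amount"]
CASHIER_KEYS = ["cashier_counter", "discount_amount", "before_tax_amount",
                "tax_amount", "cashier_amount", "product", "dine_in", "take_away"]
PAYMENT_KEYS = ["mode", "amount"]


def nest_rows(rows):
    """Group flat rows (dicts) into contract -> cashier -> payment_mode."""
    contracts = {}  # top-level values -> contract record
    cashiers = {}   # top-level + cashier values -> cashier record
    for row in rows:
        top_id = tuple(row[k] for k in TOP_KEYS)
        if top_id not in contracts:
            # Lower-case the keys so "Contract_no" becomes "contract_no".
            contract = {k.lower(): row[k] for k in TOP_KEYS}
            contract["cashier"] = []
            contracts[top_id] = contract
        cashier_id = top_id + tuple(row[k] for k in CASHIER_KEYS)
        if cashier_id not in cashiers:
            cashier = {k: row[k] for k in CASHIER_KEYS}
            cashier["payment_mode"] = []
            cashiers[cashier_id] = cashier
            contracts[top_id]["cashier"].append(cashier)
        # Every row contributes one payment entry to its cashier.
        payment = {k: row[k] for k in PAYMENT_KEYS}
        cashiers[cashier_id]["payment_mode"].append(payment)
    return list(contracts.values())


# First data row is from the question; the CARD row is hypothetical.
SAMPLE = """Contract_no,sales_date,store_sales_amount,cashier_counter,discount_amount,before_tax_amount,tax_amount,cashier_amount,product,dine_in,take_away,mode,amount
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CASH,1068.50
CS,2020-04-12,18.50,C1,0,18.50,0,18.50,18.50,0,0,CARD,25.00
"""

nested = nest_rows(csv.DictReader(io.StringIO(SAMPLE)))
print(json.dumps(nested[0], indent=4))
```

csv.DictReader yields every value as a string, which matches the quoted numbers in the expected JSON. To read from a file instead of the inline sample, pass an open file object to csv.DictReader in place of io.StringIO(SAMPLE).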