How do I create a specific nesting format JSON or dictionary from a pandas dataframe using vectorized operations?
I am attempting to make an API call. For this specific API, one of the keys in the JSON file needs to have a nested dictionary inside of it.
Here is the input data in DataFrame format:
ID Date Total_Transactions Amount Account_Name__c
1234567 2022-12-21 1 235.00 a1234567
2345678 2022-13-21 2 300.50 a2345678
The end result needs to look like this, with a key "Account_Name__r" outside of the nested dictionary:
[{'ID': '1234567',
'Date': '2022-12-21',
'Total_Transactions': 1,
'Amount': 235.00,
'Account_Name__r': {'Account_Name__c':'a1234567'}},
{'ID': '2345678',
'Date': '2022-13-21',
'Total_Transactions': 2,
'Amount': 300.50,
'Account_Name__r': {'Account_Name__c':'a2345678'}}]
The data is coming from a DataFrame. I can get a normal DataFrame to export properly, but I am having issues with the nesting. Here's what it looks like when I export the DataFrame as ordinary JSON:
Code:
final.to_json(orient='records')
Output:
[{'ID': '1234567',
'Date': '2022-12-21',
'Total_Transactions': 1,
'Amount': 235.00,
'Account_Name__c':'a1234567'},
{'ID': '2345678',
'Date': '2022-13-21',
'Total_Transactions': 2,
'Amount': 300.50,
'Account_Name__c':'a2345678'}]
Any ideas on how I need to structure my DataFrame, and what transformations/functions I need to use, to get the nested structure shown at the top? I am looking to achieve this by performing vectorized operations in pandas and by using the df.to_json() method.
I am not looking for a for loop solution. That is easy, but it does not help me learn how to create different kinds of complex JSON structures from a pandas DataFrame, and in my case it is not scalable for the large datasets I'll be using.
Try this:
import pandas as pd

data=[{'ID': '1234567',
'Date': '2022-12-21',
'Total_Transactions': 1,
'Amount': 235.00,
'Account_Name__c':'a1234567'},
{'ID': '2345678',
'Date': '2022-13-21',
'Total_Transactions': 2,
'Amount': 300.50,
'Account_Name__c':'a2345678'}]
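df = pd.DataFrame(data)
# Wrap each account name in a one-key dict so to_json emits a nested object
df["Account_Name__r"] = df["Account_Name__c"].apply(lambda x: {"Account_Name__c": x})
# The flat column is no longer needed once its values have been nested
df = df.drop(columns=["Account_Name__c"])
print(df.to_json(orient='records'))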
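To check that the dict-valued column really comes out as a nested object, the JSON string can be parsed back with the standard json module. This is only a minimal sketch on a cut-down version of the sample data; the variable name records is made up for the illustration. Note that Series.apply with a lambda still calls the function once per row in Python, so it is tidy but not vectorized in the strict sense.

```python
import json

import pandas as pd

# Cut-down version of the sample data from the question
df = pd.DataFrame({
    "ID": ["1234567", "2345678"],
    "Total_Transactions": [1, 2],
    "Account_Name__c": ["a1234567", "a2345678"],
})

# Same approach as above: wrap each value in a one-key dict, then drop the flat column
df["Account_Name__r"] = df["Account_Name__c"].apply(lambda x: {"Account_Name__c": x})
df = df.drop(columns=["Account_Name__c"])

# to_json returns a string; parse it back to inspect the structure
records = json.loads(df.to_json(orient="records"))
print(records[0]["Account_Name__r"])
```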
Try:
out = df.to_dict(orient="records")
for d in out:
    d["ID"] = str(d["ID"])
    d["Account_Name__r"] = {"Account_Name__c": d.pop("Account_Name__c")}
print(out)
Prints:
[
    {
        "ID": "1234567",
        "Date": "2022-12-21",
        "Total_Transactions": 1,
        "Amount": 235.0,
        "Account_Name__r": {"Account_Name__c": "a1234567"},
    },
    {
        "ID": "2345678",
        "Date": "2022-13-21",
        "Total_Transactions": 2,
        "Amount": 300.5,
        "Account_Name__r": {"Account_Name__c": "a2345678"},
    },
]
I found the answer by breaking this down into a smaller problem. I posted that question here: Is there a way to store a dictionary on each row of a dataframe column using a vectorized operation?
User Panda Kim gets credit for solving the initial problem: https://stackoverflow.com/users/20430449/panda-kim
This is the solution, using Panda Kim's answer along with the final step that I pieced together.
First, we create a new column named after the key that will sit outside the wrapped dictionary. To get its values, we select 'Account_Name__c' as a one-column DataFrame, transpose it with the .T attribute so that each row becomes a column, and call to_dict(), which yields one {'Account_Name__c': value} dictionary per row. Wrapping the result in pd.Series lets it be assigned back as a column:
final_insert['Account_Name__r'] = pd.Series(final_insert[['Account_Name__c']].T.to_dict())
The result:
ID Date Total_Transactions Account_Name__r
1234567 2022-12-21 1 {'Account_Name__c':'a1234567'}
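To make the transposition step easier to follow, here is a small sketch that prints the intermediate dictionary before it is wrapped in a Series. The names wrapped and direct are made up for the illustration, and the frame is a cut-down version of the sample data. The second half shows what appears to be an equivalent shortcut, to_dict(orient="records") on the one-column frame, which skips the transpose; treat it as an alternative under the assumption that the row order is what matters, not the index labels.

```python
import pandas as pd

final_insert = pd.DataFrame({
    "ID": ["1234567", "2345678"],
    "Account_Name__c": ["a1234567", "a2345678"],
})

# After .T the row labels become column labels, so to_dict() is keyed by row label:
# each row label maps to a {'Account_Name__c': <value>} dictionary
wrapped = final_insert[["Account_Name__c"]].T.to_dict()
print(wrapped)

# pd.Series(wrapped) is indexed by those row labels, so it aligns with the frame
final_insert["Account_Name__r"] = pd.Series(wrapped)

# Alternative without the transpose: one dict per row, in row order
direct = final_insert[["Account_Name__c"]].to_dict(orient="records")
print(direct)
```

One thing to watch with the transposed version: the dictionary is keyed by index label, so it relies on the DataFrame index being unique. Calling reset_index(drop=True) first is one way to be safe.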
Finally, we convert the entire DataFrame to a list of dictionaries or to JSON, using either .to_dict() or .to_json():
final_insert = final_insert.to_dict(orient='records')
The result:
[{'ID': '1234567',
'Date': '2022-12-21',
'Total_Transactions': 1,
'Amount': 235.00,
'Account_Name__r': {'Account_Name__c':'a1234567'}}]
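For other nesting shapes, the same idea can be wrapped in a small function. The sketch below is not from any of the answers: nest_columns is a hypothetical helper name, and it assumes the columns to nest should be removed from the top level, which the target structure in the question suggests. It builds the per-row dictionaries with to_dict(orient="records") on the selected columns, so it also works when more than one column goes under the same key.

```python
import pandas as pd


def nest_columns(df, parent_key, child_cols):
    """Hypothetical helper: move child_cols into one dict-valued column named parent_key."""
    out = df.drop(columns=child_cols)  # drop returns a new DataFrame
    out[parent_key] = df[child_cols].to_dict(orient="records")  # one dict per row
    return out


final_insert = pd.DataFrame({
    "ID": ["1234567", "2345678"],
    "Date": ["2022-12-21", "2022-13-21"],
    "Total_Transactions": [1, 2],
    "Amount": [235.00, 300.50],
    "Account_Name__c": ["a1234567", "a2345678"],
})

nested = nest_columns(final_insert, "Account_Name__r", ["Account_Name__c"])
payload = nested.to_dict(orient="records")
print(payload)
```

Calling nested.to_json(orient="records") instead of to_dict gives the JSON string form of the same structure.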