
Convert a list of nested dictionaries into a Pandas DataFrame

I need to convert a list of nested dictionaries into a Pandas DataFrame. My list is the following:

data = [{"2016-09-24":{"totalRevenue":123, "netIncome":456, "ebit":789}}, {"2015-09-24":{"totalRevenue":789, "netIncome":456, "ebit":123}}]

I want to transform the list into something like this, where the dates are the column headers and the rows are the keys of the nested dicts.

[Image: the result I want as a DataFrame]

I have tried different things, e.g.: https://www.tutorialspoint.com/python-convert-list-of-nested-dictionary-into-pandas-dataframe

But I can't seem to fix my problem.

I hope this makes sense, and thanks for your help :-)

Update: I have found a solution :-)

Thanks for the notice on how to write questions @HarryPlotter and thanks for the suggested solution @Geoffrey.
I found an answer to my problem:

pd.concat([pd.DataFrame(l) for l in my_list],axis=1)
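For reference, here is a self-contained sketch of that one-liner applied to the `data` list from the question. It assumes `my_list` above refers to that same list; the name `result` is only for illustration.

```python
import pandas as pd

data = [
    {"2016-09-24": {"totalRevenue": 123, "netIncome": 456, "ebit": 789}},
    {"2015-09-24": {"totalRevenue": 789, "netIncome": 456, "ebit": 123}},
]

# Each list element becomes a one-column DataFrame (column = date,
# index = inner keys); concat along axis=1 places them side by side.
result = pd.concat([pd.DataFrame(d) for d in data], axis=1)
print(result)
```

The dates end up as column headers and the inner keys (`totalRevenue`, `netIncome`, `ebit`) as the row index.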

Here's my solution.

The for loops could probably be vectorized, but you must watch for the correct arrangement of keys. Dictionaries are stored as hash maps, and before Python 3.7 the order in which keys are returned was not guaranteed (from 3.7 on, dicts preserve insertion order). You might run some code and see the keys always returned in the same order, but on older versions that behavior is not something to rely on, so I chose for loops that look up every value by key.

Also, you must:

import pandas as pd

In case that wasn't already clear...

#Function can be iterated over list of multiple nested dictionaries.
def toPandasDF(listOfDicts,listEl):
    
    #Grab first key of outer dictionary.
    firstKey = next(iter(listOfDicts[listEl]))
    
    #Extract rows (indices) of df and headers of df using first key.
    indices = list(listOfDicts[listEl][firstKey].keys())
    headers = list(listOfDicts[listEl].keys())
    
    #Initialize df.
    df = pd.DataFrame(columns = headers, index = indices)
    
    #Store relevant information in respective df elements. (could be vectorized)
    for row in range(len(indices)):
        for col in range(len(headers)):
            df.iat[row,col] = listOfDicts[listEl][headers[col]][indices[row]]
    
    #Return df
    return df

One more thing, here's how I'd iterate over a list of dicts and extract multiple data frames:

for k in range(len(listOfDicts)):
    #Pass the list itself as well as the element index.
    df = toPandasDF(listOfDicts, k)

But without an example, it's tough to tell whether this would work for your application.
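On the vectorization point: as a rough sketch, the `pd.DataFrame` constructor accepts a dict of dicts directly and matches the inner keys by label rather than by position, so it could stand in for the nested loops. The `listOfDicts` sample below is hypothetical (one list element holding two dates, with the inner keys deliberately in different orders), and `frames` is just an illustrative name.

```python
import pandas as pd

# Hypothetical input: a single list element containing several dates.
listOfDicts = [
    {
        "2016-09-24": {"totalRevenue": 123, "netIncome": 456, "ebit": 789},
        # Inner keys listed in a different order on purpose.
        "2015-09-24": {"ebit": 123, "netIncome": 456, "totalRevenue": 789},
    }
]

# Outer keys become columns, inner keys become the index; values are
# aligned by key label, so the differing key order does not matter.
df = pd.DataFrame(listOfDicts[0])
print(df)

# One DataFrame per list element, collected in a list.
frames = [pd.DataFrame(d) for d in listOfDicts]
```

Because alignment is by label, this avoids relying on dictionary ordering at all.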
