
How can I loop through a list of nested dictionaries and create a multi-indexed pandas DataFrame?

I have the following sample, a dictionary of nested dictionaries. I want to turn it into a DataFrame in which the global ID in each key (e.g. 2O2Fr$t4X7Zf8NOew3FNld) becomes the column, while the values become the rows. I tried, but I have not found the right code. I hope to end up with a multi-indexed DataFrame.

Here is the sample list of nested dictionaries and the code I used.

    reformed_dict2 = {('2O2Fr$t4X7Zf8NOew3FNld', 'Pset_WallCommon'): {'Reference': 'Basic Wall:Interior - Partition (92mm Stud)', 
'LoadBearing': False, 'ExtendToStructure': False, 'IsExternal': False, 'id': 4093}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Constraints'): {'Location Line': 2, 'Base Constraint': 'Level 1', 
'Base Offset': 0.0, 'Base is Attached': False, 'Base Extension Distance': 0.0, 'Top Constraint': 'Up to level: Level 2', 'Unconnected Height': 2.795000000000196, 
'Top Offset': -0.3050000000000001, 'Top is Attached': False, 'Top Extension Distance': 0.0, 
'Room Bounding': True, 'Related to Mass': False, 'id': 4142}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Other'): {'InstallationDate': 'InstallationDate', 'SerialNumber': 'SerialNumber', 
'WarrantyStartDate': 'WarrantyStartDate', 'BarCode': 'BarCode', 'AssetIdentifier': 'AssetIdentifier', 
'TagNumber': 'TagNumber', 'id': 4144}, ('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Phasing'): {'Phase Created': 'New Construction', 'id': 4146}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Structural'): {'Structural Usage': 0, 'id': 4148}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Dimensions'): {'Length': 3.791499999999996, 'Area': 10.01448500000069, 'Volume': 1.241796140000085, 'id': 4150}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Type_Construction'): {'Wrapping at Inserts': 0, 'Wrapping at Ends': 2, 'Width': 0.124, 'Function': 0, 'id': 4152}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Type_Graphics'): {'Coarse Scale Fill Color': 0, 'id': 4153}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Type_Identity Data'): {'Manufacturer': 'Manufacturer', 'Assembly Description': '', 'Assembly Code': '', 'id': 4154}, 
('2O2Fr$t4X7Zf8NOew3FNld', 'PSet_Revit_Type_Other'): {'AccessibilityPerformance': 'AccessibilityPerformance', 'CodePerformance': 'CodePerformance', 'Color': 'Color', 
'Constituents': 'Constituents', 'Features': 'Features', 'Finish': 'Finish', 'Grade': 'Grade', 'Material': 'Material', 'ModelReference': 'ModelReference', 
'NominalHeight': 'NominalHeight', 'NominalLength': 'NominalLength', 'NominalWidth': 'NominalWidth', 'ProductionYear': 'ProductionYear', 
'Reference': 'Reference', 'Shape': 'Shape', 'Size': 'Size', 'SustainabilityPerformance': 'SustainabilityPerformance', 
'WarrantyDescription': 'WarrantyDescription', 'WarrantyDurationLabor': 'WarrantyDurationLabor', 'WarrantyDurationParts': 'WarrantyDurationParts', 
'WarrantyGuarantorLabor': 'WarrantyGuarantorLabor', 'WarrantyGuarantorParts': 'WarrantyGuarantorParts', 'ModelNumber': 'ModelNumber', 
'ExpectedLife': 'ExpectedLife', 'ReplacementCost': 'ReplacementCost', 'AssetAccountingType': 'FIXED', 'id': 4155},
('2O2Fr$t4X7Zf8NOew3FNbT', 'Pset_WallCommon'): {'Reference': 'Basic Wall:Party Wall - CMU Residential Unit Dimising Wall', 'LoadBearing': False, 'ExtendToStructure': False, 
'IsExternal': True, 'id': 4246}, ('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Constraints'): {'Location Line': 0, 'Base Constraint': 'Level 1', 'Base Offset': 0.0, 'Base is Attached': False, 
'Base Extension Distance': 0.0, 'Top Constraint': 'Up to level: Level 2', 'Unconnected Height': 2.795000000000196, 'Top Offset': -0.3050000000000002, 
'Top is Attached': False, 'Top Extension Distance': 0.0, 'Room Bounding': True, 'Related to Mass': False, 'id': 4294}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Other'): {'InstallationDate': 'InstallationDate', 'SerialNumber': 'SerialNumber', 'WarrantyStartDate': 'WarrantyStartDate', 'BarCode': 'BarCode', 
'AssetIdentifier': 'AssetIdentifier', 'TagNumber': 'TagNumber', 'id': 4296}, ('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Phasing'): {'Phase Created': 'New Construction', 'id': 4298}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Structural'): {'Structural Usage': 0, 'id': 4300}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Dimensions'): {'Length': 4.191499999999984, 'Area': 11.74179500000078, 'Volume': 5.788704935000469, 'id': 4302}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Type_Construction'): {'Wrapping at Inserts': 0, 'Wrapping at Ends': 0, 'Width': 0.5500000000000002, 'Function': 5, 'id': 4304}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Type_Graphics'): {'Coarse Scale Fill Color': 0, 'id': 4305}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Type_Identity Data'): {'Manufacturer': 'Manufacturer', 'Assembly Description': '', 'Assembly Code': '', 'id': 4306}, 
('2O2Fr$t4X7Zf8NOew3FNbT', 'PSet_Revit_Type_Other'): {'AccessibilityPerformance': 'AccessibilityPerformance', 'CodePerformance': 'CodePerformance', 'Color': 'Color', 
'Constituents': 'Constituents', 'Features': 'Features', 'Finish': 'Finish', 'Grade': 'Grade', 'Material': 'Material', 'ModelReference': 'ModelReference', 
'NominalHeight': 'NominalHeight', 'NominalLength': 'NominalLength', 'NominalWidth': 'NominalWidth', 'ProductionYear': 'ProductionYear', 'Reference': 'Reference', 'Shape': 'Shape', 'Size': 'Size', 
'SustainabilityPerformance': 'SustainabilityPerformance', 'WarrantyDescription': 'WarrantyDescription', 'WarrantyDurationLabor': 'WarrantyDurationLabor', 'WarrantyDurationParts': 'WarrantyDurationParts', 
'WarrantyGuarantorLabor': 'WarrantyGuarantorLabor', 'WarrantyGuarantorParts': 'WarrantyGuarantorParts', 'ModelNumber': 'ModelNumber', 
'ExpectedLife': 'ExpectedLife', 'ReplacementCost': 'ReplacementCost', 'AssetAccountingType': 'FIXED', 'id': 4307}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'Pset_WallCommon'): {'Reference': 'Basic Wall:Party Wall - CMU Residential Unit Dimising Wall', 'LoadBearing': False, 'ExtendToStructure': False, 'IsExternal': True, 'id': 4356}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Constraints'): {'Location Line': 0, 'Base Constraint': 'Level 1', 'Base Offset': 0.0, 'Base is Attached': False, 'Base Extension Distance': 0.0, 
'Top Constraint': 'Up to level: Level 2', 'Unconnected Height': 2.795000000000196, 'Top Offset': -0.3050000000000001, 'Top is Attached': False, 'Top Extension Distance': 0.0, 
'Room Bounding': True, 'Related to Mass': False, 'id': 4372}, ('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Other'): {'InstallationDate': 'InstallationDate', 'SerialNumber': 'SerialNumber', 
'WarrantyStartDate': 'WarrantyStartDate', 'BarCode': 'BarCode', 'AssetIdentifier': 'AssetIdentifier', 'TagNumber': 'TagNumber', 'id': 4374}, ('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Phasing'): {'Phase Created': 'New Construction', 'id': 4376}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Structural'): {'Structural Usage': 0, 'id': 4378}, ('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Dimensions'): {'Length': 4.1915, 'Area': 11.74179500000083, 'Volume': 5.788704935000408, 'id': 4380}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Type_Construction'): {'Wrapping at Inserts': 0, 'Wrapping at Ends': 0, 'Width': 0.5500000000000002, 'Function': 5, 'id': 4304}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Type_Graphics'): {'Coarse Scale Fill Color': 0, 'id': 4305}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Type_Identity Data'): {'Manufacturer': 'Manufacturer', 'Assembly Description': '', 'Assembly Code': '', 'id': 4306}, 
('2O2Fr$t4X7Zf8NOew3FKRi', 'PSet_Revit_Type_Other'): {'AccessibilityPerformance': 'AccessibilityPerformance', 
'CodePerformance': 'CodePerformance', 'Color': 'Color', 'Constituents': 'Constituents', 'Features': 'Features', 'Finish': 'Finish', 'Grade': 'Grade', 'Material': 'Material', 'ModelReference': 'ModelReference', 
'NominalHeight': 'NominalHeight', 'NominalLength': 'NominalLength', 'NominalWidth': 'NominalWidth', 'ProductionYear': 'ProductionYear', 'Reference': 'Reference', 'Shape': 'Shape', 'Size': 'Size', 
'SustainabilityPerformance': 'SustainabilityPerformance', 'WarrantyDescription': 'WarrantyDescription', 'WarrantyDurationLabor': 'WarrantyDurationLabor', 'WarrantyDurationParts': 'WarrantyDurationParts', 
'WarrantyGuarantorLabor': 'WarrantyGuarantorLabor', 'WarrantyGuarantorParts': 'WarrantyGuarantorParts', 'ModelNumber': 'ModelNumber', 'ExpectedLife': 'ExpectedLife', 'ReplacementCost': 'ReplacementCost', 
'AssetAccountingType': 'FIXED', 'id': 4307}}  

The code I used is shown below. I am a beginner in Python and pandas, but I have a lot of complex data that I would like to analyze with pandas.

import pandas as pd

df1 = pd.DataFrame(reformed_dict2)
df1.columns = pd.MultiIndex.from_tuples(df1.columns)
df1

Your code can generate a DataFrame without error if you change

`new_dict = {}` 

to

`reformed_dict = {}`

and change

`sample_list_of_nestdict[0].items`

to

`sample_list_of_nestdict[0].items()`
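To illustrate why the parentheses matter, here is a minimal sketch. The dictionary `sample` is a made-up stand-in, not your actual data. Without the parentheses, `.items` is just the method object, and a `for` loop cannot iterate over it.

```python
# Hypothetical one-entry nested dict, standing in for sample_list_of_nestdict[0].
sample = {"Pset_WallCommon": {"IsExternal": False}}

error_message = None
try:
    # Missing parentheses: this refers to the method itself, not its result.
    for key, value in sample.items:
        pass
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# With parentheses the method is called and yields (key, value) pairs.
pairs = list(sample.items())
print(pairs)
```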

UPDATE

Your loop did go through all three items in your wall_sets_list. However, as you would expect, any two items share the same set of keys, so when you update reformed_dict for the second item, it overwrites what the first item already stored there. For example:

reformed_dict[('key1', 'key2')] = 3 # first item
reformed_dict[('key1', 'key2')] = 4 # second item

What remains is just 4. This explains why your original code only gives you data about the last item.
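The same effect can be reproduced with the nested loop itself. This is only a sketch: the two-item list `items` is a made-up stand-in for wall_sets_list, and I am assuming each item maps an outer key to an inner dictionary.

```python
# Hypothetical stand-in for wall_sets_list: two items with identical keys.
items = [
    {"Pset_WallCommon": {"IsExternal": False, "id": 1}},
    {"Pset_WallCommon": {"IsExternal": True, "id": 2}},
]

# A single dict shared by every item, as in the original loop.
shared = {}
for item in items:
    for outer_key, inner_dict in item.items():
        for inner_key, value in inner_dict.items():
            # Same (outer_key, inner_key) for both items, so the
            # second assignment replaces the first.
            shared[(outer_key, inner_key)] = value

print(len(shared))  # 2 keys, not 4
print(shared)       # only the values of the last item survive
```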

I made changes (marked with # change) so that you now have a list of dictionaries, with each dictionary storing the data for one item.

import pandas as pd

# wall_sets_list is the list of nested dictionaries from the question
reformed_dicts = []  # change
i = 0
while i <= 2:
    # Iterate over the sample list of nested dicts to create a reformed dict
    # for item in wall_sets_list:
    reformed_dict = {}  # change
    for outerKey, innerDict in wall_sets_list[i].items():
        for innerKey, values in innerDict.items():
            reformed_dict[(outerKey, innerKey)] = values

    reformed_dicts.append(reformed_dict)  # change
    i += 1
    # print(reformed_dict)
    # print(len(reformed_dict))
df2 = pd.DataFrame(reformed_dicts)  # change
df2.columns = pd.MultiIndex.from_tuples(df2.columns)
df2
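If the goal is the layout described in the question (one column per global ID, property values as rows), one possible approach is to flatten a dictionary shaped like reformed_dict2 into a Series with a three-level index and then unstack the global ID. This is a sketch under assumptions, not the only way to do it: the IDs `wall_A` and `wall_B`, the shortened property sets, and the level names are all made up for illustration.

```python
import pandas as pd

# Small stand-in shaped like reformed_dict2:
# (global id, property set) -> {property: value}
sample = {
    ("wall_A", "Pset_WallCommon"): {"IsExternal": False, "id": 4093},
    ("wall_A", "PSet_Revit_Dimensions"): {"Length": 3.79, "id": 4150},
    ("wall_B", "Pset_WallCommon"): {"IsExternal": True, "id": 4246},
    ("wall_B", "PSet_Revit_Dimensions"): {"Length": 4.19, "id": 4302},
}

# Flatten to one entry per (property set, property, global id).
keys = []
values = []
for (global_id, pset), props in sample.items():
    for prop, value in props.items():
        keys.append((pset, prop, global_id))
        values.append(value)

# Level names are illustrative assumptions.
index = pd.MultiIndex.from_tuples(keys, names=["pset", "property", "global_id"])
series = pd.Series(values, index=index, dtype=object)

# Move the global id level into the columns; (pset, property) stays as
# a two-level row index.
table = series.unstack("global_id")
print(table)
```

A single cell can then be looked up with `table.loc[("Pset_WallCommon", "IsExternal"), "wall_B"]`. Because the values have mixed types, the columns are kept as `object` dtype here.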


 