Python: converting a nested dictionary with a list containing float elements into individual elements

I'm collecting values from several arrays and from a nested dictionary whose innermost value is a list of single-element lists, as shown below. The lists contain millions of rows. I tried concatenating pandas DataFrames but ran out of memory, so I resorted to a for loop.

array1_str = ['user_1', 'user_2', 'user_3', 'user_4', 'user_5']
array2_int = [3, 3, 1, 2, 4]
nested_dict_w_list = {'outer_dict': {'inner_dict': [[1.0001], [2.0033], [1.3434], [2.3434], [0.44224]]}}

final_out = [[array1_str[i], array2_int[i], nested_dict_w_list['outer_dict']['inner_dict'][array2_int[i]]] for i in range(len(array2_int))]

I'm getting this output:

user_1, 3, [2.3434]
user_2, 3, [2.3434]
user_3, 1, [2.0033]
user_4, 2, [1.3434]
user_5, 4, [0.44224]

But I want this output:

user_1, 3, 2.3434
user_2, 3, 2.3434
user_3, 1, 2.0033
user_4, 2, 1.3434
user_5, 4, 0.44224

I eventually need to write this to a Parquet file. I'm using a Spark DataFrame for the conversion, but the column's schema appears as array(double), and I need it to be just double. Any input is appreciated.

The for loop below works, but is there a more efficient and elegant solution?

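final_output = []
for i in range(len(array2_int)):
    # Look up the single-element list, then unwrap its only float with [0].
    value = nested_dict_w_list['outer_dict']['inner_dict'][array2_int[i]]
    # append() takes exactly one argument, so the row is passed as a tuple.
    final_output.append((array1_str[i], array2_int[i], value[0]))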

You can modify your original list comprehension by indexing into element zero of each inner list:

final_out = [
    (array1_str[i], array2_int[i], nested_dict_w_list['outer_dict']['inner_dict'][array2_int[i]][0])
    for i in range(len(array2_int))
]
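
As a variation on the same idea, here is a minimal sketch that pairs the two arrays with zip and resolves the nested lookup once, outside the loop. The helper name iter_rows and the variable inner are hypothetical and not part of the original post; the sample data is copied from the question. Because the helper is a generator, rows can also be consumed in batches, which may help when the lists hold millions of entries.

```python
from itertools import islice

array1_str = ['user_1', 'user_2', 'user_3', 'user_4', 'user_5']
array2_int = [3, 3, 1, 2, 4]
nested_dict_w_list = {
    'outer_dict': {
        'inner_dict': [[1.0001], [2.0033], [1.3434], [2.3434], [0.44224]]
    }
}


def iter_rows(names, indices, lookup):
    """Yield (name, index, float) rows one at a time.

    Hypothetical helper, introduced here only for illustration.
    """
    for name, idx in zip(names, indices):
        # lookup[idx] is a one-element list such as [2.3434]; [0] unwraps it.
        yield name, idx, lookup[idx][0]


# Resolve the nested dictionary lookup once instead of on every iteration.
inner = nested_dict_w_list['outer_dict']['inner_dict']

# Materialize everything, equivalent to the list comprehension above.
final_out = list(iter_rows(array1_str, array2_int, inner))

# Alternatively, pull a limited batch from the generator (here, two rows).
first_batch = list(islice(iter_rows(array1_str, array2_int, inner), 2))
```

Each row's third field is now a plain Python float rather than a list, so a schema inferred from these rows should come out as double instead of array(double).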
