Iterating over a nested list of dictionaries with a list comprehension

Question

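I have a bunch of text files containing meteorological data. Each text file stores one half-hour of data, i.e. 18000 observations (lines). There are 48 files in total (one full day), and I have stored all of the data in the following structure: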

# all_data is a list of dictionaries, len=48 --> each dict represents one file

all_data = [{'time': '0026',
             'filename': 'file1.txt',
             # all_data[i]['data'] is a list of dictionaries, len=18000
             # each dict in all_data[i]['data'] represents one line of the corresponding file
             'data': [{'x': 1.345, 'y': -0.779, 'z': 0.023, 'temp': 298.11},
                      {'x': 1.277, 'y': -0.731, 'z': 0.086, 'temp': 297.88},
                      ...,
                      {'x': 2.119, 'y': 1.332, 'z': -0.009, 'temp': 299.14}]
             },

             {'time': '0056',
              'filename': 'file2.txt',
              'data': [{'x': 1.216, 'y': -0.648, 'z': 0.881, 'temp': 301.11},
                      {'x': 0.866, 'y': 0.001, 'z': 0.031, 'temp': 301.32},
                      ...,
                      {'x': 0.181, 'y': 0.498, 'z': 0.101, 'temp': 300.91}]
             },
             ...
             ]

Now I need to unpack it. I need to create a list of all the values of x, in order (all_data[i]['data'][j]['x']), to use for plotting. Luckily, the data is already stored in order.

I know I could do something like this to achieve my goal:

x_list = []
for dictionary in all_data:
    for record in dictionary['data']: # loop over list of dictionaries
        x_list.append(record['x'])

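But I have to do something similar for many variables that I left out here for simplicity, and I really don't want to rewrite this loop 20 times or create 20 new lists by hand.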

Is there a way to use a list comprehension to iterate over a nested data structure like this?

I took a shot in the dark and tried:

[x for x in all_data[i for i in len(all_data)]['data'][j for j in len(all_data[i]['data'])]

Which of course doesn't work. Any ideas?

This is my desired output, which is just the values of 'x' from the nested 'data' lists:

all_x = [1.345, 1.277, ..., 2.119, 1.216, 0.866, ..., 0.181, ...]

Thanks in advance!

Answer 1

from itertools import chain
[k['x'] for k in chain.from_iterable(i['data'] for i in all_data)]
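
For comparison, the same flattening can also be written as a single list comprehension with two for clauses, which read left to right in the same order as the nested loops in the question. The sketch below is only an illustration: the helper name extract, the variables tuple, and the shortened two-records-per-file sample are assumptions made here, not part of the original answer. Wrapping the comprehension in a function is one way to avoid repeating it for each of the 20 variables.

```python
# Shortened sample in the shape described in the question (two records per file).
all_data = [
    {'time': '0026', 'filename': 'file1.txt',
     'data': [{'x': 1.345, 'y': -0.779, 'z': 0.023, 'temp': 298.11},
              {'x': 1.277, 'y': -0.731, 'z': 0.086, 'temp': 297.88}]},
    {'time': '0056', 'filename': 'file2.txt',
     'data': [{'x': 1.216, 'y': -0.648, 'z': 0.881, 'temp': 301.11},
              {'x': 0.866, 'y': 0.001, 'z': 0.031, 'temp': 301.32}]},
]


def extract(all_data, key):
    """Return every record[key] in file order, then line order (hypothetical helper)."""
    # Outer clause walks the files, inner clause walks the lines of each file.
    return [record[key] for file_dict in all_data for record in file_dict['data']]


# One flat list per variable, built without rewriting the loop each time.
variables = ('x', 'y', 'z', 'temp')
series = {name: extract(all_data, name) for name in variables}

print(series['x'])
```

With this layout, series['temp'] or any other variable can be passed straight to a plotting call.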

Answer 2

You can try this:

import itertools
all_data = [{'time': '0026', 'filename': 'file1.txt', 'data': [{'x': 1.345, 'y': -0.779, 'z': 0.023, 'temp': 298.11}, {'x': 1.277, 'y': -0.731, 'z': 0.086, 'temp': 297.88}, {'x': 2.119, 'y': 1.332, 'z': -0.009, 'temp': 299.14}]},
            {'time': '0056', 'filename': 'file2.txt', 'data': [{'x': 1.216, 'y': -0.648, 'z': 0.881, 'temp': 301.11}, {'x': 0.866, 'y': 0.001, 'z': 0.031, 'temp': 301.32}, {'x': 0.181, 'y': 0.498, 'z': 0.101, 'temp': 300.91}]}]

x_data = list(itertools.chain.from_iterable([[b["x"] for b in i["data"]] for i in all_data]))
print(x_data)

Output:

[1.345, 1.277, 2.119, 1.216, 0.866, 0.181]

Answer 3

If you don't mind using Pandas, it may be a good way to get what you need. Running

dataDfList = [pandas.DataFrame(f['data']) for f in all_data]

produces a list of DataFrames, each of which looks like:

|   | temp   | x     | y      | z      |
|---|--------|-------|--------|--------|
| 0 | 298.11 | 1.345 | -0.779 | 0.023  |
| 1 | 297.88 | 1.277 | -0.731 | 0.086  |
| 2 | 299.14 | 2.119 | 1.332  | -0.009 |

Each one can then be plotted easily. You could also do this with a MultiIndex, for example by using pandas.concat(dataDfList) to stack the list of DataFrames.
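
As a rough sketch of the stacking idea, the frames can be concatenated into one DataFrame and a column pulled out as a flat list. Passing keys= to pandas.concat is an assumption added here so that the result gets a two-level row index (file name, line number); the names frames, combined and file2_temp and the shortened sample data are likewise illustrative only.

```python
import pandas as pd

# Shortened sample in the shape described in the question (two records per file).
all_data = [
    {'time': '0026', 'filename': 'file1.txt',
     'data': [{'x': 1.345, 'y': -0.779, 'z': 0.023, 'temp': 298.11},
              {'x': 1.277, 'y': -0.731, 'z': 0.086, 'temp': 297.88}]},
    {'time': '0056', 'filename': 'file2.txt',
     'data': [{'x': 1.216, 'y': -0.648, 'z': 0.881, 'temp': 301.11},
              {'x': 0.866, 'y': 0.001, 'z': 0.031, 'temp': 301.32}]},
]

# One DataFrame per file, one row per line of that file.
frames = [pd.DataFrame(f['data']) for f in all_data]

# Stack the frames; keys= labels each block of rows with its file name,
# giving a MultiIndex of (filename, line number).
combined = pd.concat(frames, keys=[f['filename'] for f in all_data])

# Every x value, in file order and then line order.
all_x = combined['x'].tolist()

# Values of one variable for a single file, selected by the outer index level.
file2_temp = combined.xs('file2.txt')['temp'].tolist()

print(all_x)
```

Keeping the file name in the index makes it possible to plot either the whole day or a single half-hour from the same object.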

Answer 4

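If I understand you correctly, the output you want is:

a list
in which each element is a sublist holding the values of the variables x through z and temp

rather than just a list of x values.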

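Then this is the code for you:

values = [list(row.values()) for day in all_data for row in day['data']]

Each item in values is a list of the values of x through z and temp for one record, so values as a whole is a matrix of row vectors. The list() call is needed on Python 3, where row.values() returns a view rather than a list.

For your example data above, the output is:

[[1.345, -0.779, 0.023, 298.11], [1.277, -0.731, 0.086, 297.88], [2.119, 1.332, -0.009, 299.14], [1.216, -0.648, 0.881, 301.11], [0.866, 0.001, 0.031, 301.32], [0.181, 0.498, 0.101, 300.91]]

with the columns corresponding to the ['x', 'y', 'z', 'temp'] variables (on Python 3.7+, the values follow the order in which each dict's keys were written).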

Edit: if you want to extract the values of a single variable, use numpy: convert the output to an array and take the corresponding column.
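
The column extraction mentioned in the edit might look like the sketch below. It makes one change to the approach above, stated plainly: the rows are built with an explicit columns tuple instead of row.values(), so the column positions do not depend on dict ordering. The names columns, arr, all_x and all_temp and the shortened sample data are assumptions for illustration.

```python
import numpy as np

# Shortened sample in the shape described in the question (two records per file).
all_data = [
    {'time': '0026', 'filename': 'file1.txt',
     'data': [{'x': 1.345, 'y': -0.779, 'z': 0.023, 'temp': 298.11},
              {'x': 1.277, 'y': -0.731, 'z': 0.086, 'temp': 297.88}]},
    {'time': '0056', 'filename': 'file2.txt',
     'data': [{'x': 1.216, 'y': -0.648, 'z': 0.881, 'temp': 301.11},
              {'x': 0.866, 'y': 0.001, 'z': 0.031, 'temp': 301.32}]},
]

# Fixed column order, so column i of the array always means columns[i].
columns = ('x', 'y', 'z', 'temp')

# One row per record, across all files in order.
values = [[row[name] for name in columns]
          for day in all_data for row in day['data']]

arr = np.array(values)  # 2-D array: one row per record, one column per variable

# Slice out a whole column by looking up its position in the columns tuple.
all_x = arr[:, columns.index('x')]
all_temp = arr[:, columns.index('temp')]

print(all_x)
```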
