Python Dictionary keys with nested list to pandas DataFrame

I have a dictionary as follows:

D = {
    'd1': [[a1, a1, a1], [a2, a2, a2], [a3, a3, a3]], 
    'd2': [[b1, b1, b1], [b2, b2, b2], [b3, b3, b3]], 
    'd3': [[c1, c1, c1], [c2, c2, c2], [c3, c3, c3]], 
    'd4': [[d1, d1, d1], [d2, d2, d2], [d3, d3, d3]]
}

How do I convert it to a dataframe such that:

  • The columns from the lists for a key are paired up; the nested lists are time values, temperatures and damage values, respectively, and the dataframe needs to have these in separate columns. So for [[a1, a1, a1], [a2, a2, a2], [a3, a3, a3]], you'd get a row with a1, a2, a3 (first column), followed by a row for the 2nd column, etc.

  • The dataframe rows are grouped by combining each key with the next key: d1 combined with d2 makes 6 rows (3 from d1 and 3 from d2), then d2 is combined with d3 to make 6 more rows, etc. So for the 4 keys with 3 rows each, you get 3 combinations of 6 rows == 18 rows.

I tried converting to a dataframe before concatenating:

new_df = pd.DataFrame(list(D.values()), columns=['Time_sec', 'Temperature', 'Damage'])

but I am still stuck on the concatenating part.

Sample expected output:

[image: expected output DataFrame]

You want to zip() together each sublist for a given key, to form new rows with values from each sublist combined:

>>> list(zip(*D['d1']))
[('a1', 'a2', 'a3'), ('a1', 'a2', 'a3'), ('a1', 'a2', 'a3')]
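The transposition is easier to see with distinct values. The following is a minimal sketch using made-up numbers (they are an assumption for illustration, not data from the question): each sublist is one column, and zip(*...) regroups them into rows.

```python
# Hypothetical sample data: one sublist per column.
columns = [
    [0, 1, 2],           # time values
    [20.5, 21.0, 21.5],  # temperatures
    [0.1, 0.2, 0.3],     # damage values
]

# zip(*columns) takes the i-th element of every sublist and groups them,
# so each resulting tuple is one (time, temperature, damage) row.
rows = list(zip(*columns))
print(rows)
# [(0, 20.5, 0.1), (1, 21.0, 0.2), (2, 21.5, 0.3)]
```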

then apply this to every value in the dictionary to produce a flattened sequence of rows, where you pick your pairings.

I'm assuming you want to pair up dN with dN+1 here, regardless of the number of keys. Note that dictionaries are actually unordered (although insertion order is preserved in CPython 3.6 and guaranteed by the language from Python 3.7), so you may want to apply some sorting first:

sorted_keys = sorted(D)

after which we can pair them up with zip(sorted_keys, sorted_keys[1:]):

>>> sorted_keys = sorted(D)
>>> list(zip(sorted_keys, sorted_keys[1:]))
[('d1', 'd2'), ('d2', 'd3'), ('d3', 'd4')]
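One caveat with plain sorted(): strings compare lexicographically, so a key like 'd10' would sort before 'd2'. If the keys carry a numeric suffix, something like the sketch below may help. The natural_key helper and the key names are hypothetical, introduced only for illustration.

```python
import re

def natural_key(key):
    """Split a key such as 'd10' into text and integer parts for sorting."""
    # Hypothetical helper: digit runs compare as integers, the rest as text.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", key)]

keys = ["d1", "d10", "d2", "d3"]  # made-up keys to show the difference

print(sorted(keys))                      # ['d1', 'd10', 'd2', 'd3']
ordered = sorted(keys, key=natural_key)
print(ordered)                           # ['d1', 'd2', 'd3', 'd10']

# Pair each key with its successor, as in the answer.
pairs = list(zip(ordered, ordered[1:]))
print(pairs)                             # [('d1', 'd2'), ('d2', 'd3'), ('d3', 'd10')]
```

With only d1 to d4, as in the question, plain sorted() gives the same order and the helper is not needed.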

Use this sequence to pair up the rows: flatten the resulting key sequence, then flatten the zipped rows:

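import pandas as pd

sorted_keys = sorted(D)
# flatten the (dN, dN+1) pairs into a single sequence of keys: d1, d2, d2, d3, ...
paired = (k for keys in zip(sorted_keys, sorted_keys[1:]) for k in keys)
df = pd.DataFrame(
    # transpose each key's sublists into rows, in pairing order
    (row for k in paired for row in zip(*D[k])),
    columns=['Time_sec', 'Temperature', 'Damage']
)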

This produces:

   Time_sec Temperature Damage
0        a1          a2     a3
1        a1          a2     a3
2        a1          a2     a3
3        b1          b2     b3
4        b1          b2     b3
5        b1          b2     b3
6        b1          b2     b3
7        b1          b2     b3
8        b1          b2     b3
9        c1          c2     c3
10       c1          c2     c3
11       c1          c2     c3
12       c1          c2     c3
13       c1          c2     c3
14       c1          c2     c3
15       d1          d2     d3
16       d1          d2     d3
17       d1          d2     d3
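For reference, here is a self-contained variant that can be run as-is. It is a sketch rather than the method above: it builds one small frame per key and joins them with pd.concat, and it adds a Pair column that is purely an assumption on my part (it is not in the question) to make the grouping visible. The placeholders are written as strings.

```python
import pandas as pd

# String placeholders standing in for the real measurements.
D = {
    "d1": [["a1"] * 3, ["a2"] * 3, ["a3"] * 3],
    "d2": [["b1"] * 3, ["b2"] * 3, ["b3"] * 3],
    "d3": [["c1"] * 3, ["c2"] * 3, ["c3"] * 3],
    "d4": [["d1"] * 3, ["d2"] * 3, ["d3"] * 3],
}

sorted_keys = sorted(D)
frames = []
for left, right in zip(sorted_keys, sorted_keys[1:]):
    for key in (left, right):
        # Transpose the three sublists into (time, temperature, damage) rows.
        frame = pd.DataFrame(
            list(zip(*D[key])),
            columns=["Time_sec", "Temperature", "Damage"],
        )
        # Hypothetical extra column recording which pair the rows belong to.
        frame["Pair"] = f"{left}-{right}"
        frames.append(frame)

df = pd.concat(frames, ignore_index=True)
print(df.shape)  # (18, 4)
```

Dropping the Pair assignment gives the same 18 rows and three columns as the generator-based version.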

Using enumerate

l = ['Time', 'Temperature', 'Damage']
d2 = {}

for idx, item in enumerate(l):
    for k, v in D.items():
        if item not in d2:
            # copy, so the += below does not extend the list stored in D
            d2[item] = list(v[idx])
        else:
            d2[item] += v[idx]

With each sublist written as a single string (for example ['a1, a1, a1']), d2 is:

{'Time': ['a1, a1, a1', 'b1, b1, b1', 'c1, c1, c1', 'd1, d1, d1'], 'Temperature': ['a2, a2, a2', 'b2, b2, b2', 'c2, c2, c2', 'd2, d2, d2'], 'Damage': ['a3, a3, a3', 'b3, b3, b3', 'c3, c3, c3', 'd3, d3, d3']}

Using pseudo values

a1, a2, a3 = 0, 'a', '!'
b1, b2, b3 = 0, 'a', '!'
c1, c2, c3 = 0, 'a', '!'
d1, d2, d3 = 0, 'a', '!'

{'Time': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'Temperature': ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a'], 'Damage': ['!', '!', '!', '!', '!', '!', '!', '!', '!', '!', '!', '!']}
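Because += on a list extends it in place, a version that builds fresh lists can be easier to reason about. The sketch below is one possible alternative using itertools.chain; the sample numbers and the names and columns variables are assumptions for illustration only.

```python
from itertools import chain

# Made-up data: two keys, each with time, temperature and damage sublists.
D = {
    "d1": [[1, 2, 3], [10, 20, 30], [0.1, 0.2, 0.3]],
    "d2": [[4, 5, 6], [40, 50, 60], [0.4, 0.5, 0.6]],
}
names = ["Time", "Temperature", "Damage"]

# For each column position, chain the matching sublist from every key.
# New lists are built, so the lists stored in D are left untouched.
columns = {
    name: list(chain.from_iterable(v[idx] for v in D.values()))
    for idx, name in enumerate(names)
}
print(columns["Time"])  # [1, 2, 3, 4, 5, 6]
```

Note that this, like the loop above, concatenates each key once; it does not repeat the middle keys the way the dN with dN+1 pairing does.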
