
How to append elements from a list into nested lists in a dataframe one by one

import pandas as pd

d = {'A': [1,2,3,4], 'B': [[[1,2],[2,3]],[[3,4],[2,5]],[[5,6],[5,6],[5,6]],[7,8]]}

df = pd.DataFrame(data=d)

C = [1,2,3,4,5,6,7,8]

I have a pandas dataframe and would like to append the elements of the list C to the nested lists in B, one element per innermost list and in order, maintaining the structure, so that the resulting dataframe is:

'A': [1,2,3,4]
'B': [[[1,2,1],[2,3,2]],[[3,4,3],[2,5,4]],[[5,6,5],[5,6,6],[5,6,7]],[7,8,8]]

Maybe there is a more elegant solution, but this works :-)

for i in d['B']:
    for j in i:
        if (isinstance(j, list)):
            j.append(C.pop(0))
        else:
            i.append(C.pop(0))
            break

A more efficient solution based on timgeb's comment (thank you!), which iterates over C instead of repeatedly calling C.pop(0):

f = iter(C)
for i in d['B']:
    for j in i:
        if (isinstance(j, list)):
            j.append(next(f))
        else:
            i.append(next(f))
            break
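
If mutating the original lists is not wanted, the same iterator idea can be wrapped in a function that builds new lists instead. This is only a sketch: `append_in_order` is a made-up helper name, and it assumes that every row is either a list of lists or a flat list, as in the question.

```python
def append_in_order(rows, values):
    """Return new nested lists with one value appended to each innermost list.

    `append_in_order` is a hypothetical helper name, not part of pandas.
    """
    it = iter(values)  # shared iterator: each value is consumed exactly once
    result = []
    for row in rows:
        if row and isinstance(row[0], list):
            # row is a list of lists: extend a copy of every inner list
            result.append([inner + [next(it)] for inner in row])
        else:
            # row is itself a flat list such as [7, 8]
            result.append(list(row) + [next(it)])
    return result


B = [[[1, 2], [2, 3]], [[3, 4], [2, 5]], [[5, 6], [5, 6], [5, 6]], [7, 8]]
C = [1, 2, 3, 4, 5, 6, 7, 8]

new_B = append_in_order(B, C)
print(new_B)
```

With the dataframe from the question, `df['B'] = append_in_order(df['B'], C)` should write the result back while leaving both `C` and the original nested lists untouched.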

This is an alternative method using itertools.

The idea is to flatten the list of lists, append your data, then split again using the number of lists stored for each row.
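
To make the three steps concrete, here is a minimal sketch on a small made-up input, without pandas. The names `rows`, `extra`, `flat`, `extended` and `bounds` are illustrative and do not come from the answer.

```python
from itertools import accumulate, chain

rows = [[[1, 2], [2, 3]], [[3, 4]]]   # made-up input: 2 inner lists, then 1
extra = [10, 20, 30]                  # one value per inner list

# 1. Flatten: one flat list holding all inner lists.
flat = list(chain.from_iterable(rows))          # [[1, 2], [2, 3], [3, 4]]

# 2. Append: pair every inner list with its value (this builds new lists).
extended = [inner + [value] for inner, value in zip(flat, extra)]

# 3. Split: cumulative row lengths give the slice boundaries.
bounds = [0] + list(accumulate(len(row) for row in rows))   # [0, 2, 3]
result = [extended[start:stop] for start, stop in zip(bounds, bounds[1:])]

print(result)  # [[[1, 2, 10], [2, 3, 20]], [[3, 4, 30]]]
```

The same three steps, applied to the B column of the dataframe: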

from itertools import chain, accumulate
import pandas as pd

d = {'A': [1,2,3,4], 'B': [[[1,2],[2,3]],[[3,4],[2,5]],[[5,6],[5,6],[5,6]],[[7,8]]]}
df = pd.DataFrame(data=d)
C = [1,2,3,4,5,6,7,8]

acc = [0] + list(accumulate(map(len, df['B'])))

lst = [j+[C[i]] for i, j in enumerate(chain.from_iterable(df['B']))]

df['B'] = [lst[x:y] for x, y in zip(acc, acc[1:])]

Note that I have made an important change to the input: the last element of series B is now a list of lists ([[7,8]] rather than [7,8]), just like all the other elements. For consistency, I would recommend this in any case.

Result

   A                                  B
0  1             [[1, 2, 1], [2, 3, 2]]
1  2             [[3, 4, 3], [2, 5, 4]]
2  3  [[5, 6, 5], [5, 6, 6], [5, 6, 7]]
3  4                        [[7, 8, 8]]

import pandas as pd

d = {'A': [1,2,3,4], 'B': [[[1,2],[2,3]],[[3,4],[2,5]],[[5,6],[5,6],[5,6]],[7,8]]}

df = pd.DataFrame(data=d)

C = [1,2,3,4,5,6,7,8]

# Wrap a flat row such as [7, 8] first, so every element of B is a list of lists
# and B_len counts inner lists rather than the ints of a flat row
df['B'] = df.B.apply(lambda x: [x] if isinstance(x[0], int) else x)
df['B_len'] = df.B.apply(len)
df['B_len_cumsum'] = df.B_len.cumsum()
# Slice out the values of C that belong to each row
df['C'] = df.apply(lambda row: C[row['B_len_cumsum']-row['B_len']:row['B_len_cumsum']], axis=1)
for x, y in zip(df.B, df.C):
    for xx, yy in zip(x, y):
        xx.append(yy)
df

Output:

   A                                  B  B_len  B_len_cumsum          C
0  1             [[1, 2, 1], [2, 3, 2]]      2             2     [1, 2]
1  2             [[3, 4, 3], [2, 5, 4]]      2             4     [3, 4]
2  3  [[5, 6, 5], [5, 6, 6], [5, 6, 7]]      3             7  [5, 6, 7]
3  4                        [[7, 8, 8]]      1             8        [8]
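
The output above still carries the helper columns B_len, B_len_cumsum and C, which are not part of the result asked for in the question. One way to remove them afterwards is `DataFrame.drop`. The small dataframe below is a made-up stand-in with the same column names, so the snippet runs on its own.

```python
import pandas as pd

# Made-up stand-in for the dataframe after the loop has run.
df = pd.DataFrame({
    'A': [1, 2],
    'B': [[[1, 2, 1], [2, 3, 2]], [[3, 4, 3]]],
    'B_len': [2, 1],
    'B_len_cumsum': [2, 3],
    'C': [[1, 2], [3]],
})

# Drop the bookkeeping columns; errors='ignore' skips any that are absent.
df = df.drop(columns=['B_len', 'B_len_cumsum', 'C'], errors='ignore')
print(df.columns.tolist())  # ['A', 'B']
```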
