How to Merge Multiple pandas DataFrames into an Array for Each Column Value Based on Another Column Value
I have several pandas DataFrames that I would like to merge together. When I merge them, I would like the values from columns with the same name to be collected into an array of values.
For example, I would like to merge two DataFrames on rows that share the same value in a specified column. After merging, the data for each such row becomes an array of values.
df1 =
A Value
0 x 0
1 y 0
df2 =
A Value
0 x 1
1 y 1
2 z 1
After Combining:
df =
A Number_Value
0 x [0, 1]
1 y [0, 1]
2 z [, 1]
I do not believe that merge() or concat() would be appropriate. I thought that calling .to_numpy() could do this if I converted each value in each row to an array, but that does not seem to work.
Use concat, then group by A and aggregate with list:
df = pd.concat([df1, df2]).groupby('A', as_index=False).agg(list)
print (df)
A Value
0 x [0, 1]
1 y [0, 1]
2 z [1]
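For anyone who wants to reproduce this, here is a minimal self-contained sketch. The way df1 and df2 are constructed is an assumption rebuilt from the tables in the question, and the name `combined` is only illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the question's tables (assumed construction).
df1 = pd.DataFrame({"A": ["x", "y"], "Value": [0, 0]})
df2 = pd.DataFrame({"A": ["x", "y", "z"], "Value": [1, 1, 1]})

# Stack the frames vertically, then collect every Value sharing the same key in A.
combined = pd.concat([df1, df2]).groupby("A", as_index=False).agg(list)
print(combined)
```

Because the rows are stacked before grouping, a key that appears in only one frame (z here) simply ends up with a shorter list.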
To check whether any of the DataFrames is missing the A column:
L = [df1, df2]
print ([x for x in L if 'A' not in x.columns])
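The check above prints the offending frames themselves, which can be hard to read for large inputs. A small sketch that reports their positions in the list instead; the helper name `frames_missing_column` and the `broken` frame are hypothetical and exist only for this illustration.

```python
import pandas as pd

def frames_missing_column(frames, column):
    """Return the list positions of the frames that do not have `column`."""
    return [i for i, frame in enumerate(frames) if column not in frame.columns]

df1 = pd.DataFrame({"A": ["x", "y"], "Value": [0, 0]})
broken = pd.DataFrame({"B": ["x"], "Value": [1]})  # deliberately lacks an 'A' column

missing = frames_missing_column([df1, broken], "A")
print(missing)
```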
EDIT: To get '' for the missing values, pass it as the fill_value parameter of reindex:
L = [df1, df2]
df = pd.concat(L, keys=range(len(L))).reset_index(level=1, drop=True).set_index('A', append=True)
mux = pd.MultiIndex.from_product(df.index.levels)
df = df.reindex(mux, fill_value='').groupby('A').agg(list).reset_index()
print (df)
A Value
0 x [0, 1]
1 y [0, 1]
2 z [, 1]
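The same steps can be sketched as a self-contained script. The sample frames are rebuilt from the question, the intermediate names (`stacked`, `full_index`, `padded`, `result`) are illustrative, and passing `names=` explicitly to `from_product` is an addition here so that the A level keeps its name without relying on inference. The sketch also assumes each key appears at most once per frame, since reindexing needs a unique index.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["x", "y"], "Value": [0, 0]})
df2 = pd.DataFrame({"A": ["x", "y", "z"], "Value": [1, 1, 1]})
frames = [df1, df2]

# Index every row by (position of its source frame, key in A).
stacked = (
    pd.concat(frames, keys=range(len(frames)))
    .reset_index(level=1, drop=True)
    .set_index("A", append=True)
)

# Build every (frame, key) combination, keeping the level names explicitly.
full_index = pd.MultiIndex.from_product(
    stacked.index.levels, names=stacked.index.names
)

# Combinations that did not exist get '' as a placeholder, then values are collected.
padded = stacked.reindex(full_index, fill_value="")
result = padded.groupby("A").agg(list).reset_index()
print(result)
```

Note that filling with '' turns Value into an object column that mixes integers and strings, so a different placeholder may be a better fit if the lists are processed numerically afterwards.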