How to append a column to a DataFrame that collects values from another DataFrame in Python?

Question

I have two tables (as Pandas DataFrames). One looks like this:

name	val
name1	0
name2	1

The other is:

name	tag
name1	tg1
name1	tg2
name1	tg3
name1	tg3
name2	kg1
name2	kg1
name3	other

I want to append a column to the first DataFrame that collects all values from the second table by name, i.e.

name	val	new_column
name1	0	[tg1, tg2, tg3, tg3]
name2	1	[kg1, kg1]

I know I can use a row-wise operation to achieve this, but is there a way to do it with built-in Pandas methods? And if I also want to remove duplicates from the collected list in new_column at the same time, what method should I use?

Answer 1

Use DataFrame.join with the tags aggregated into a list per name:

df = df1.join(df2.groupby('name')['tag'].agg(list).rename('new_column'), on='name')
print(df)
    name  val            new_column
0  name1    0  [tg1, tg2, tg3, tg3]
1  name2    1            [kg1, kg1]
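For the second part of the question (removing duplicates in new_column), one possible approach is to drop repeated (name, tag) pairs from the second table before grouping. The sketch below rebuilds the sample tables from the question so it runs on its own; the variable names unique_tags and df_unique are made up for illustration. It keeps the first occurrence of each tag, so the original order is preserved.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({"name": ["name1", "name2"], "val": [0, 1]})
df2 = pd.DataFrame({
    "name": ["name1", "name1", "name1", "name1", "name2", "name2", "name3"],
    "tag": ["tg1", "tg2", "tg3", "tg3", "kg1", "kg1", "other"],
})

# Drop repeated (name, tag) pairs first, then collect the remaining tags per name.
unique_tags = (
    df2.drop_duplicates(["name", "tag"])
       .groupby("name")["tag"]
       .agg(list)
       .rename("new_column")
)

# Left join on the 'name' column; name3 is not in df1, so it does not appear.
df_unique = df1.join(unique_tags, on="name")
print(df_unique)
```

If the order of the tags does not matter, aggregating with set instead of list should also work, at the cost of getting sets rather than lists in the column.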
