Pandas merging selected columns into 1
I have a df like this:
ID1 ID2 Day Text1 Text2 Text3 ....
111 A 1 a b c
222 B 2 i j k
333 C 3 x y z
My goal is to create a new column that contains all the values of Text1, Text2, Text3, and so on.
ID1 ID2 Day Text1 Text2 Text3 .... Text
111 A 1 a b c a, b, c...
222 B 2
333 C 3 x y x, y, ....
I've tried:
list(zip(df.Text1,df.Text2,df.Text3,...))
This works, but the format isn't what I want.
And:
df.apply(lambda x: ', '.join(x.astype(str)), axis=1)
This gives the desired format, but the result contains all fields, not just the Text columns.
What would be the best approach to this? Many thanks!
Vectorized solution:
In [65]: df['Text'] = df.filter(regex=r'^Text\d+').add(', ').sum(axis=1).str.rstrip(', ')
In [66]: df
Out[66]:
ID1 ID2 Day Text1 Text2 Text3 Text
0 111 A 1 a b c a, b, c
1 222 B 2 i j k i, j, k
2 333 C 3 x y z x, y, z
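The one-liner above can be unpacked step by step. The following is a minimal, self-contained sketch that rebuilds the sample frame from the question; the variable name `text_part` and the explicit cast to `object` are my own additions (the cast is only a precaution so that the row-wise sum performs plain string concatenation whatever the default string dtype is), not part of the original answer.

```python
import pandas as pd

# Sample frame mirroring the layout shown in the question.
df = pd.DataFrame({
    "ID1": [111, 222, 333],
    "ID2": ["A", "B", "C"],
    "Day": [1, 2, 3],
    "Text1": ["a", "i", "x"],
    "Text2": ["b", "j", "y"],
    "Text3": ["c", "k", "z"],
})

# Keep only columns named Text<digits>. The raw string passes \d to the
# regex engine unchanged. The object cast is a precaution (assumption).
text_part = df.filter(regex=r"^Text\d+$").astype(object)

# 1. append ", " to every cell
# 2. concatenate the cells of each row (sum over axis=1)
# 3. trim the separator left over at the end of each row
df["Text"] = text_part.add(", ").sum(axis=1).str.rstrip(", ")

print(df)
```

Note that `rstrip(", ")` strips any trailing commas and spaces, so this sketch assumes the text values themselves do not end in those characters.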
Your code is very close. You just need to use apply on df[text_cols], where text_cols is a list of the columns you want to merge into a new one.
df['Text'] = df[text_cols].apply(lambda x: ', '.join(x), axis=1)
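The expected output in the question appears to have rows with missing text values, which a plain `join` would not handle. Below is a rough sketch of the same apply approach that skips missing cells. The way `text_cols` is built, the helper name `join_row`, and the choice of `None` to represent missing values are all assumptions made for illustration.

```python
import pandas as pd

# Sample frame with some missing text cells (None is an assumed stand-in
# for whatever represents "missing" in the real data).
df = pd.DataFrame({
    "ID1": [111, 222, 333],
    "ID2": ["A", "B", "C"],
    "Day": [1, 2, 3],
    "Text1": ["a", None, "x"],
    "Text2": ["b", None, "y"],
    "Text3": ["c", None, None],
})

# One possible way to build text_cols: every column named Text<digits>.
text_cols = df.filter(regex=r"^Text\d+$").columns.tolist()


def join_row(row):
    # Drop missing cells, then join what is left.
    # A row with no text at all becomes an empty string.
    return ", ".join(row.dropna().astype(str))


df["Text"] = df[text_cols].apply(join_row, axis=1)

print(df)
```

Row-wise `apply` runs a Python function per row, so on large frames it will likely be slower than the vectorized variants, but it makes the handling of missing values explicit.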
There is also a vectorized join:
>>> df['Text'] = df.filter(regex=r'^Text\d+').sum(axis=1).str.join(', ')
>>> df
ID1 ID2 Day Text1 Text2 Text3 Text
0 111 A 1 a b c a, b, c
1 222 B 2 i j k i, j, k
2 333 C 3 x y z x, y, z
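One caveat worth checking before using this: `sum` first glues each row into a single string, and `Series.str.join` then places the separator between every character of that string. That matches the desired output for the single-character sample data, but not for longer values. The sketch below illustrates the difference on made-up two-column data; the names `per_char` and `per_cell` are hypothetical, and the cast to `object` is only a precaution so that the sum concatenates plain strings.

```python
import pandas as pd

# Made-up data with a multi-character value in the first row.
df = pd.DataFrame({
    "Text1": ["ab", "i"],
    "Text2": ["cd", "j"],
})
text_part = df.filter(regex=r"^Text\d+$")

# sum(axis=1) yields "abcd" and "ij"; str.join then separates characters.
per_char = text_part.astype(object).sum(axis=1).str.join(", ")

# Joining the cells of each row keeps multi-character values intact.
per_cell = text_part.apply(", ".join, axis=1)

print(per_char.tolist())  # ['a, b, c, d', 'i, j']
print(per_cell.tolist())  # ['ab, cd', 'i, j']
```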
The other solutions are awesome, and I would like to offer an answer that uses the cat() function.
df['text'] = df[0].str.cat([df[i] for i in df.columns[1:]],sep=',')
Hope it helps : )
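The snippet in this answer indexes columns as `df[0]` and `df.columns[1:]`, which assumes a frame whose columns are all text and are labelled by position. A possible adaptation to the frame in the question, restricted to the Text columns, might look like the sketch below; `text_cols`, `first`, and `rest` are names introduced here for illustration.

```python
import pandas as pd

# Sample frame mirroring the layout shown in the question.
df = pd.DataFrame({
    "ID1": [111, 222, 333],
    "ID2": ["A", "B", "C"],
    "Day": [1, 2, 3],
    "Text1": ["a", "i", "x"],
    "Text2": ["b", "j", "y"],
    "Text3": ["c", "k", "z"],
})

# Columns to merge, listed explicitly here (an assumption for this sketch).
text_cols = ["Text1", "Text2", "Text3"]
first, rest = text_cols[0], text_cols[1:]

# Start from the first text column and concatenate the others onto it.
df["Text"] = df[first].str.cat([df[c] for c in rest], sep=", ")

print(df)
```

If some cells may be missing, `str.cat` also accepts an `na_rep` argument; without it, a row with a missing value in any of the joined columns gets a missing result.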