How to do column string concatenation including space separator in Pandas dataframe?
I have a Pandas DataFrame as follows:
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'txt1': ['Hello there1', 'Hello there2', 'Hello there3'],
    'txt2': ['Hello there4', 'Hello there5', 'Hello there6'],
    'txt3': ['Hello there7', 'Hello there8', 'Hello there9']
})
df
id txt1 txt2 txt3
1 Hello there1 Hello there4 Hello there7
2 Hello there2 Hello there5 Hello there8
3 Hello there3 Hello there6 Hello there9
I want to concatenate the columns txt1, txt2 and txt3. So far I have been able to do it as follows:
df['alltext'] = df['txt1'] + df['txt2'] + df['txt3']
df
id txt1 txt2 txt3 alltext
1 Hello there1 Hello there4 Hello there7 Hello there1Hello there4Hello there7
2 Hello there2 Hello there5 Hello there8 Hello there2Hello there5Hello there8
3 Hello there3 Hello there6 Hello there9 Hello there3Hello there6Hello there9
But how do I introduce a space character between the column strings while concatenating in Pandas?
I have just started learning Pandas.
You can also add a separator between the columns:
df['alltext'] = df['txt1'] + ' ' + df['txt2'] + ' ' + df['txt3']
Or use DataFrame.filter to keep only the columns whose names contain txt, then use apply with join on each row:
df['alltext'] = df.filter(like='txt').apply(' '.join, 1)
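The filter-and-join approach above can be sketched end to end as follows. This is a minimal sketch using the DataFrame from the question; the intermediate name txt_cols is only for illustration, and axis=1 is spelled out in place of the positional 1.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'txt1': ['Hello there1', 'Hello there2', 'Hello there3'],
    'txt2': ['Hello there4', 'Hello there5', 'Hello there6'],
    'txt3': ['Hello there7', 'Hello there8', 'Hello there9'],
})

# Keep only the columns whose name contains 'txt' (txt_cols is an illustrative name)
txt_cols = df.filter(like='txt')

# Join the values of each row with a single space; axis=1 applies the function row-wise
df['alltext'] = txt_cols.apply(' '.join, axis=1)

print(df['alltext'].tolist())
```

Because ' '.join only accepts strings, this assumes every selected column holds string values.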
Or use DataFrame.select_dtypes to keep only the object columns - in most cases a Series with object dtype holds strings - but it can hold any Python object:
df['alltext'] = df.select_dtypes('object').apply(' '.join, 1)
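The caveat about object columns can be illustrated with a small sketch. The frame below is hypothetical (the column name mixed is not from the question), and the columns are created with an explicit dtype=object so the example does not depend on how a given pandas version infers string columns. Converting with astype(str) before joining is one possible workaround, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame: 'mixed' has object dtype but holds a non-string value
df = pd.DataFrame({
    'id': [1, 2],
    'txt1': pd.Series(['Hello there1', 'Hello there2'], dtype=object),
    'mixed': pd.Series(['Hello there4', 5], dtype=object),
})

obj_cols = df.select_dtypes('object')

# ' '.join raises TypeError as soon as a row contains a non-string value
try:
    obj_cols.apply(' '.join, axis=1)
    join_failed = False
except TypeError:
    join_failed = True

# Assumed workaround: convert every value to str first, then join row-wise
df['alltext'] = obj_cols.astype(str).apply(' '.join, axis=1)

print(join_failed)
print(df['alltext'].tolist())
```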
Or select the columns by position with DataFrame.iloc - here all columns except the first:
df['alltext'] = df.iloc[:, 1:].apply(' '.join, 1)
Thanks to @Jon Clements for a solution that matches more precisely the column names made of txt followed by digits:
df['alltext'] = df.filter(regex=r'^txt\d+$').apply(' '.join, 1)
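To see why the regex can match more precisely than like='txt', here is a small sketch. The extra column txt_note is hypothetical and only added to show the difference between the two filters.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'txt1': ['a', 'b'],
    'txt2': ['c', 'd'],
    'txt_note': ['x', 'y'],  # hypothetical extra column, not in the question
})

# like='txt' keeps every column whose name contains the substring 'txt'
like_cols = list(df.filter(like='txt').columns)

# The anchored regex keeps only names that are exactly 'txt' followed by digits
regex_cols = list(df.filter(regex=r'^txt\d+$').columns)

df['alltext'] = df.filter(regex=r'^txt\d+$').apply(' '.join, axis=1)

print(like_cols)
print(regex_cols)
print(df['alltext'].tolist())
```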
Just add a space between them:
df['alltext'] = df['txt1'] + ' ' + df['txt2'] + ' ' + df['txt3']