How to concat a list of columns into one column in a pandas DataFrame?
I have a list of columns and I need to combine them into a single column in the DataFrame. Could you help me with how to do this?
Examples:
example-1 column_list = ['File Type', 'Number of Records']
df['pk'] = df['File Type'] + df['Number of Records']
example-2 column_list = ['File Type', 'Number of Records', 'Indication']
df['pk'] = df['File Type'] + df['Number of Records'] + df['Indication']
example-3 column_list = ['File Type']
df['pk'] = df['File Type']
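First select the columns named in column_list, then convert the values to strings with DataFrame.astype (needed if at least one column is not a string column), and finally join them with DataFrame.agg: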
df['pk'] = df[column_list].astype(str).agg(''.join, axis=1)
Or with DataFrame.apply:
df['pk'] = df[column_list].astype(str).apply(''.join, axis=1)
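For comparison, the chained `+` from the question can also be generalized to any list length. The following is only a sketch of an alternative, not part of the answer above: it assumes the same column names as the question, and the name `string_columns` is made up for illustration. It uses `functools.reduce` to add the string-converted columns pairwise.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({"File Type": ["aa", "bb"],
                   "Number of Records": [1, 5],
                   "Indication": ["ind1", "ind2"]})
column_list = ["File Type", "Number of Records", "Indication"]

# Convert each selected column to strings (hypothetical helper list),
# then add the resulting Series pairwise, left to right.
string_columns = [df[col].astype(str) for col in column_list]
df["pk"] = reduce(operator.add, string_columns)

print(df["pk"].tolist())
```

With a single-element column_list, `reduce` simply returns that one column converted to strings, which matches example-3 from the question.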
Sample:
import pandas as pd

df = pd.DataFrame({'File Type':['aa','bb'],
                   'Number of Records':[1,5],
                   'Indication':['ind1','ind2']})
column_list1 = ['File Type', 'Number of Records']
column_list2 = ['File Type', 'Number of Records', 'Indication']
column_list3 = ['File Type']
df['pk1'] = df[column_list1].astype(str).agg(''.join, axis=1)
df['pk2'] = df[column_list2].astype(str).agg(''.join, axis=1)
df['pk3'] = df[column_list3].astype(str).agg(''.join, axis=1)
print(df)
  File Type  Number of Records Indication  pk1      pk2 pk3
0        aa                  1       ind1  aa1  aa1ind1  aa
1        bb                  5       ind2  bb5  bb5ind2  bb
Another idea is to use sum:
df['pk1'] = df[column_list1].astype(str).sum(axis=1)
df['pk2'] = df[column_list2].astype(str).sum(axis=1)
df['pk3'] = df[column_list3].astype(str).sum(axis=1)
print(df)
  File Type  Number of Records Indication  pk1      pk2 pk3
0        aa                  1       ind1  aa1  aa1ind1  aa
1        bb                  5       ind2  bb5  bb5ind2  bb
If the joined columns are numeric, the problem with the sum solution is that the output is converted to floats:
df = pd.DataFrame({'File Type':[4,5], 'Number of Records':[1,5], 'Indication':[8,9]})
column_list1 = ['File Type', 'Number of Records']
column_list2 = ['File Type', 'Number of Records', 'Indication']
column_list3 = ['File Type']
df['pk1'] = df[column_list1].astype(str).sum(axis=1)
df['pk2'] = df[column_list2].astype(str).sum(axis=1)
df['pk3'] = df[column_list3].astype(str).sum(axis=1)
print(df)
   File Type  Number of Records  Indication   pk1    pk2  pk3
0          4                  1           8  41.0  418.0  4.0
1          5                  5           9  55.0  559.0  5.0
print(df.dtypes)
File Type              int64
Number of Records      int64
Indication             int64
pk1                  float64
pk2                  float64
pk3                  float64
dtype: object
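To contrast with the float output above, here is a minimal sketch that applies the agg solution from the start of the answer to the same numeric data. The name `pk_values` is introduced only for this illustration. Since `''.join` builds each row's value as a Python string, the joined values should remain strings.

```python
import pandas as pd

df = pd.DataFrame({"File Type": [4, 5],
                   "Number of Records": [1, 5],
                   "Indication": [8, 9]})
column_list = ["File Type", "Number of Records"]

# Join the string-converted values row by row; each result is a plain str
df["pk"] = df[column_list].astype(str).agg("".join, axis=1)

pk_values = df["pk"].tolist()  # hypothetical name, used only to inspect the result
print(pk_values)
print(all(isinstance(value, str) for value in pk_values))
```

If the key has to stay a string (for example to preserve leading zeros), agg or apply with `''.join` appears to be the safer choice of the two approaches.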