
How to concat a list of columns into one column in a pandas DataFrame?

I have a list of columns and I need to combine them into a single column in a DataFrame. Can you help me with how to do this?

Examples:

example-1 column_list = ['File Type', 'Number of Records']
          df['pk'] = df['File Type'] + df['Number of Records']


example-2 column_list = ['File Type', 'Number of Records', 'Indication']
          df['pk'] = df['File Type'] + df['Number of Records'] + df['Indication']


example-3 column_list = ['File Type']
          df['pk'] = df['File Type']

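First select the columns named in column_list, then convert the values to strings with DataFrame.astype (needed if at least one column is not a string column), and finally join them row by row with DataFrame.agg: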

df['pk'] = df[column_list].astype(str).agg(''.join, axis=1)

Or with DataFrame.apply:

df['pk'] = df[column_list].astype(str).apply(''.join, axis=1)

Sample:

import pandas as pd

df = pd.DataFrame({'File Type':['aa','bb'],
                   'Number of Records':[1,5],
                   'Indication':['ind1','ind2']})

column_list1 = ['File Type', 'Number of Records']
column_list2 = ['File Type', 'Number of Records', 'Indication']
column_list3 = ['File Type']
df['pk1'] = df[column_list1].astype(str).agg(''.join, axis=1)
df['pk2'] = df[column_list2].astype(str).agg(''.join, axis=1)
df['pk3'] = df[column_list3].astype(str).agg(''.join, axis=1)
print (df)
  File Type  Number of Records Indication  pk1      pk2 pk3
0        aa                  1       ind1  aa1  aa1ind1  aa
1        bb                  5       ind2  bb5  bb5ind2  bb
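The same pattern can be wrapped in a small function so it works for any column list. This is only a sketch: the name concat_columns and its sep parameter are made up for illustration and are not part of pandas. A separator is optional, but it can keep keys unambiguous (without one, 'a' + 'bc' and 'ab' + 'c' both give 'abc').

```python
import pandas as pd


def concat_columns(df, columns, sep=""):
    """Join the given columns row-wise into one Series of strings.

    Hypothetical helper: the name and the `sep` argument are assumptions,
    it simply reuses the astype(str) + agg(join) pattern shown above.
    """
    return df[columns].astype(str).agg(sep.join, axis=1)


df = pd.DataFrame({'File Type': ['aa', 'bb'],
                   'Number of Records': [1, 5],
                   'Indication': ['ind1', 'ind2']})

# No separator: same result as the pk1 column in the sample.
df['pk'] = concat_columns(df, ['File Type', 'Number of Records'])

# With a separator between the parts.
df['pk_sep'] = concat_columns(
    df, ['File Type', 'Number of Records', 'Indication'], sep='_')

print(df[['pk', 'pk_sep']])
```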

Another idea is to use sum:

df['pk1'] = df[column_list1].astype(str).sum(axis=1)
df['pk2'] = df[column_list2].astype(str).sum(axis=1)
df['pk3'] = df[column_list3].astype(str).sum(axis=1)
print (df)
  File Type  Number of Records Indication  pk1      pk2 pk3
0        aa                  1       ind1  aa1  aa1ind1  aa
1        bb                  5       ind2  bb5  bb5ind2  bb
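A further variant, not from the answer above, mirrors the df['A'] + df['B'] + ... chain from the question by folding + over the list with functools.reduce. It is a sketch under the assumption that every column is cast to string first; it also covers the single-column case, where reduce just returns that one Series.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({'File Type': ['aa', 'bb'],
                   'Number of Records': [1, 5],
                   'Indication': ['ind1', 'ind2']})

column_list = ['File Type', 'Number of Records', 'Indication']

# Cast each column to string, then add the Series together left to right,
# which is what df['File Type'] + df['Number of Records'] + ... does by hand.
string_columns = [df[col].astype(str) for col in column_list]
df['pk'] = reduce(operator.add, string_columns)

# With a single column, reduce returns that column unchanged.
df['pk_single'] = reduce(operator.add, [df['File Type'].astype(str)])

print(df[['pk', 'pk_single']])
```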

The problem with the sum solution is that, if all the joined columns are numeric, the output is converted to floats:

df = pd.DataFrame({'File Type':[4,5], 'Number of Records':[1,5], 'Indication':[8,9]})

column_list1 = ['File Type', 'Number of Records']
column_list2 = ['File Type', 'Number of Records', 'Indication']
column_list3 = ['File Type']
df['pk1'] = df[column_list1].astype(str).sum(axis=1)
df['pk2'] = df[column_list2].astype(str).sum(axis=1)
df['pk3'] = df[column_list3].astype(str).sum(axis=1)
print (df)
   File Type  Number of Records  Indication   pk1    pk2  pk3
0          4                  1           8  41.0  418.0  4.0
1          5                  5           9  55.0  559.0  5.0

print (df.dtypes)
File Type              int64
Number of Records      int64
Indication             int64
pk1                  float64
pk2                  float64
pk3                  float64
dtype: object
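For comparison, here is a small check of the agg-based join on the same all-numeric data. It joins the string values of each row directly, so the key values stay Python strings rather than numbers.

```python
import pandas as pd

df = pd.DataFrame({'File Type': [4, 5],
                   'Number of Records': [1, 5],
                   'Indication': [8, 9]})

column_list2 = ['File Type', 'Number of Records', 'Indication']

# Cast to str, then join the values of each row with ''.join.
pk = df[column_list2].astype(str).agg(''.join, axis=1)

print(pk.tolist())                           # ['418', '559']
print(all(isinstance(v, str) for v in pk))   # True
```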

