How to merge dataframe rows into a single row, with all row values of each column combined?

Question

I have a df like:

  | col1  | col2   | col3
0 | Text1 | a,b ,c | klra-tk³,t54 ? 
1 | Text2 | NaN    | gimbal3, gimbal4
2 | Text3 | a,k,m  | NaN

I want to get a single row in which each column holds all of its unique values on one line, with NaNs ignored, like:

  | col1                | col2      | col3
0 | Text1, Text2, Text3 | a,b,c,k,m | klra-tk³,t54,gimbal3, gimbal4

How can I do this with pandas?

Answer 1

Use a custom function that splits each value with Series.str.split, reshapes the pieces with DataFrame.stack, removes duplicates with Series.drop_duplicates and missing values with Series.dropna, and then joins what is left with a comma. Finally, convert the resulting Series to a one-row DataFrame with Series.to_frame and transpose it:

f = lambda x: ','.join(x.str.split(',', expand=True).stack().drop_duplicates().dropna())
df = df.apply(f).to_frame().T
print(df)
                col1       col2                         col3
0  Text1,Text2,Text3  a,b,c,k,m  klra-tk,t54,gimbal3,gimbal4
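To make the chained call easier to follow, here is a minimal sketch that runs the same steps one at a time on a single column. The sample Series is an assumption modeled on col2 from the question, with the stray space removed; the intermediate variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample column modeled on col2 from the question (assumed, no stray spaces).
col = pd.Series(["a,b,c", np.nan, "a,k,m"], name="col2")

# 1. Split each cell on commas; expand=True spreads the pieces across columns.
parts = col.str.split(",", expand=True)

# 2. stack() reshapes the pieces into one long Series, row by row.
stacked = parts.stack()

# 3. Keep the first occurrence of each value, then drop any remaining NaN.
unique_vals = stacked.drop_duplicates().dropna()

# 4. Join the surviving values back into a single string.
joined = ",".join(unique_vals)
print(joined)  # a,b,c,k,m
```

Each intermediate result can be printed to see how the NaN row and the repeated "a" disappear along the way.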

Or use a list comprehension:

f = lambda x: ','.join(x.str.split(',', expand=True).stack().drop_duplicates().dropna())
df = pd.DataFrame([[f(df[x]) for x in df.columns]], columns=df.columns)
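Note that splitting on ',' alone keeps any spaces around the pieces, so values such as "b " or " gimbal4" from the question's data would stay untrimmed. The following is a hedged variant, not part of the original answer, that strips whitespace before removing duplicates. It swaps the split/stack step for Series.explode; the helper name join_unique and the sample frame are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data modeled on the question (assumed values, including stray spaces).
df = pd.DataFrame({
    "col1": ["Text1", "Text2", "Text3"],
    "col2": ["a,b ,c", np.nan, "a,k,m"],
    "col3": ["klra-tk,t54", "gimbal3, gimbal4", np.nan],
})

def join_unique(col: pd.Series) -> str:
    """Hypothetical helper: split on commas, trim spaces, keep unique values."""
    # Drop NaN first, split into lists, flatten with explode, then trim.
    pieces = col.dropna().str.split(",").explode().str.strip()
    # drop_duplicates keeps the first occurrence, preserving original order.
    return ",".join(pieces.drop_duplicates())

# apply() calls the helper once per column; to_frame().T gives a single row.
result = df.apply(join_unique).to_frame().T
print(result)
```

With this sample frame, col2 becomes "a,b,c,k,m" and col3 becomes "klra-tk,t54,gimbal3,gimbal4". A column that contains only NaN would need separate handling, since the .str accessor is not available on a float column.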

Answer 1 (2 votes, accepted), 2020-08-14 09:32:09
