How to convert multiple columns to single column?

I have one-hot-encoded columns in a DataFrame, df, with the zeros stored as "nan". I'm trying to convert the one-hot-encoded columns into a single column.

Assume the DataFrame df below:

    p1   |   p2  |   p3   |  p4   |  p5   |
---------------------------------------
0   cat     nan     nan     nan      nan
1   nan     dog     nan     nan      nan
2   nan     nan     horse   nan      nan
3   nan     nan     nan     donkey   nan
4   nan     nan     nan     nan      pig   

Required Output

    animals
-----------------
0   cat
1   dog
2   horse
3   donkey
4   pig

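If there is always exactly one non-missing value per row, forward fill the missing values along the columns (DataFrame.ffill with axis=1; older code uses DataFrame.fillna with method='ffill', which is deprecated in recent pandas versions) and then select the last column by position with DataFrame.iloc. To get a one-column DataFrame instead of a Series, add Series.to_frame: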

df = df.ffill(axis=1).iloc[:, -1].to_frame('new')
print (df)
      new
0     cat
1     dog
2   horse
3  donkey
4     pig
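
The question describes the missing entries as "nan". If those are literal strings rather than real missing values, ffill has nothing to fill. A minimal sketch, assuming the cells hold the text "nan" (the frame below and the column name animals are made up for illustration):

```python
import pandas as pd

# Hypothetical frame where missing cells are the literal text "nan"
df = pd.DataFrame({
    "p1": ["cat", "nan", "nan", "nan", "nan"],
    "p2": ["nan", "dog", "nan", "nan", "nan"],
    "p3": ["nan", "nan", "horse", "nan", "nan"],
    "p4": ["nan", "nan", "nan", "donkey", "nan"],
    "p5": ["nan", "nan", "nan", "nan", "pig"],
})

# Turn the "nan" strings into real missing values
cleaned = df.mask(df == "nan")

# Forward fill along each row, then keep the last column
animals = cleaned.ffill(axis=1).iloc[:, -1].to_frame("animals")
print(animals)
```

If the frame already contains real NaN values, the mask step is unnecessary and can be dropped.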

If a row can hold more than one non-missing value, use DataFrame.stack and join the values per first index level:

print (df)
    p1   p2     p3      p4    p5
0  cat  NaN    NaN     NaN  lion
1  NaN  dog    NaN     NaN   NaN
2  NaN  NaN  horse     NaN   NaN
3  NaN  NaN    NaN  donkey   NaN
4  NaN  NaN    NaN     NaN   pig

df2 = df.stack().groupby(level=0).apply(', '.join).to_frame('new')
print (df2)
         new
0  cat, lion
1        dog
2      horse
3     donkey
4        pig

Or apply a lambda function to each row:

df2 = df.apply(lambda x: x.dropna().str.cat(sep=', '), axis=1).to_frame('new')
print (df2)
         new
0  cat, lion
1        dog
2      horse
3     donkey
4        pig

If you have one word per row you can fill NaN with empty strings and sum by row:

df.fillna('').sum(axis=1)

Result:

0       cat
1       dog
2     horse
3    donkey
4       pig
dtype: object
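
To see what this kind of concatenation does when a row holds more than one value, here is a sketch that spells it out with an explicit row-wise join instead of sum. The sample frame is the two-value one from the earlier answer, rebuilt by hand, and the name joined is only for illustration:

```python
import numpy as np
import pandas as pd

# Row 0 has two values ("cat" and "lion"); every other row has one
df = pd.DataFrame({
    "p1": ["cat", np.nan, np.nan, np.nan, np.nan],
    "p2": [np.nan, "dog", np.nan, np.nan, np.nan],
    "p3": [np.nan, np.nan, "horse", np.nan, np.nan],
    "p4": [np.nan, np.nan, np.nan, "donkey", np.nan],
    "p5": ["lion", np.nan, np.nan, np.nan, "pig"],
})

# Empty strings contribute nothing, so only the real values remain
joined = df.fillna("").apply(lambda row: "".join(row), axis=1)
print(joined.tolist())
# ['catlion', 'dog', 'horse', 'donkey', 'pig']
```

The values run together without a separator, so this shortcut only makes sense when each row is known to hold a single value.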

Silly, but it works. Not sure what you expect if a row has more than one non-NA value.

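cols = df.columns  # source columns, captured before 'animals' is added
df['animals'] = df[cols[0]]
for c in cols[1:]:
    # assign the result back; inplace=True on a column selection may not update df
    df['animals'] = df['animals'].fillna(df[c])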
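
The loop keeps the first non-missing value in each row, reading left to right. A sketch of the same idea without the loop, assuming real NaN values; the sample frame is hypothetical and puts a second value in row 0 to show which one wins:

```python
import numpy as np
import pandas as pd

# Row 0 has two values; the leftmost one ("cat") should be kept
df = pd.DataFrame({
    "p1": ["cat", np.nan, np.nan, np.nan, np.nan],
    "p2": [np.nan, "dog", np.nan, np.nan, np.nan],
    "p3": [np.nan, np.nan, "horse", np.nan, np.nan],
    "p4": [np.nan, np.nan, np.nan, "donkey", np.nan],
    "p5": ["lion", np.nan, np.nan, np.nan, "pig"],
})

# Back fill along each row so the first column ends up holding
# the first non-missing value of that row
first = df.bfill(axis=1).iloc[:, 0]

# assign returns a new frame and leaves df unchanged
result = df.assign(animals=first)
print(result["animals"].tolist())
# ['cat', 'dog', 'horse', 'donkey', 'pig']
```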
