
Merging two columns in a pandas DataFrame

Given the following DataFrame:

      A     B
0 -10.0   NaN
1   NaN  20.0
2 -30.0   NaN

I want to merge columns A and B, filling the NaN cells in column A with the values from column B, and then drop column B, resulting in a DataFrame like this:

     A
0 -10.0
1  20.0
2 -30.0

I have managed to solve this problem by using the iterrows() function.

Complete code example:

import numpy as np
import pandas as pd

example_data = [[-10, np.nan], [np.nan, 20], [-30, np.nan]]

example_df = pd.DataFrame(example_data, columns = ['A', 'B'])

for index, row in example_df.iterrows():
    if pd.isnull(row['A']):
        row['A'] = row['B']

example_df = example_df.drop(columns = ['B'])        

example_df

This seems to work fine, but I found this warning in the documentation for iterrows():

You should never modify something you are iterating over.

So it seems like I'm doing it wrong.

What would be a better/recommended approach for achieving the same result?
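
The documentation warning can be illustrated with a minimal sketch. It assumes a frame with mixed dtypes (the string column 'label' is hypothetical and not part of the question), in which case iterrows() hands back a freshly built row each time, so writing to that row does not reach the original frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the string column 'label' makes the dtypes mixed,
# so iterrows() builds a separate object-dtype Series for every row.
mixed_df = pd.DataFrame({
    "A": [-10.0, np.nan, -30.0],
    "B": [np.nan, 20.0, np.nan],
    "label": ["x", "y", "z"],
})

for index, row in mixed_df.iterrows():
    if pd.isnull(row["A"]):
        row["A"] = row["B"]  # only the temporary row is changed

# Count the missing values still left in column A of the original frame.
remaining_nans = int(mixed_df["A"].isna().sum())
print(remaining_nans)
```

Here the loop leaves the NaN in column A untouched, which is why the all-float example in the question should not be relied on either.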

Use Series.fillna with Series.to_frame:

df = df['A'].fillna(df['B']).to_frame()
# alternative:
# df = df['A'].combine_first(df['B']).to_frame()
print(df)
      A
0 -10.0
1  20.0
2 -30.0
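
As a self-contained sketch on the question's own data, the same fillna idea can also be written as a column assignment followed by a drop. This variant is an assumption about what may be wanted if the frame has other columns that should be kept, since to_frame() returns only column A:

```python
import numpy as np
import pandas as pd

example_data = [[-10, np.nan], [np.nan, 20], [-30, np.nan]]
example_df = pd.DataFrame(example_data, columns=["A", "B"])

# Fill the gaps in A with the values from B at the same index labels,
# with no explicit Python loop.
example_df["A"] = example_df["A"].fillna(example_df["B"])

# B is no longer needed once its values have been copied over.
example_df = example_df.drop(columns=["B"])

print(example_df)
```

Because fillna aligns the two Series on their index, each missing value in A is taken from the same row of B.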

If there are more columns and you need the first non-missing value in each row, back-fill the missing values along the columns with bfill(axis=1), then select the first column with a one-element list so that the result is a one-column DataFrame:

df = df.bfill(axis=1).iloc[:, [0]]
print(df)
      A
0 -10.0
1  20.0
2 -30.0
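
A minimal sketch of the back-fill approach with more than two columns follows. The third column C and the name wide_df are hypothetical, chosen only to show a row where both A and B are missing:

```python
import numpy as np
import pandas as pd

# Hypothetical three-column frame; column C is not in the original question.
wide_df = pd.DataFrame({
    "A": [-10.0, np.nan, np.nan],
    "B": [np.nan, 20.0, np.nan],
    "C": [1.0, 2.0, -30.0],
})

# Back-fill along each row so that column A receives the first
# non-missing value found to its right, then keep only that column.
# Indexing with the list [0] returns a DataFrame rather than a Series.
first_valid = wide_df.bfill(axis=1).iloc[:, [0]]

print(first_valid)
```

In the last row both A and B are NaN, so the value comes from C. The column order decides which value takes priority.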
