
Merge two pandas series based on missing data

If I have:

    col1       col2
0   1          np.nan
1   2          np.nan
2   np.nan     3
4   np.nan     4

How would I efficiently get to:

    col1       col2     col3
0   1          np.nan   1
1   2          np.nan   2
2   np.nan     3        3
4   np.nan     4        4

My current solution is:

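import numpy as np
import pandas as pd

test = pd.Series([1, 2, np.nan, np.nan])
test2 = pd.Series([np.nan, np.nan, 3, 4])

temp_df = pd.concat([test, test2], axis=1)

init_cols = list(temp_df.columns)

# Start with an empty (all-NaN) result column
temp_df['test3'] = np.nan

# Copy each column's non-null values into test3
# (.loc is used here because .ix was removed in pandas 1.0)
for col in init_cols:
    mask = temp_df[col].notna()
    temp_df.loc[mask, 'test3'] = temp_df.loc[mask, col]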

Ideally I would like to avoid the use of loops.

It depends on what you want to happen when both columns have a non-null value in the same row.

Take col1 first, then fill the missing values with col2:

df['col3'] = df.col1.fillna(df.col2)

Take col2 first, then fill the missing values with col1:

df['col3'] = df.col2.fillna(df.col1)

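Average the overlap (selecting col1 and col2 explicitly, so an existing col3 is not averaged in):

df['col3'] = df[['col1', 'col2']].mean(axis=1)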

Sum the overlap (again over col1 and col2 only):

df['col3'] = df[['col1', 'col2']].sum(axis=1)
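To see how the four options differ, here is a minimal sketch on made-up data. The sample frame is an assumption for illustration and is not the data from the question: row 2 has a value in both columns and row 3 has none. The `min_count=1` variant at the end is an extra, in case you want a row with no values at all to stay NaN instead of summing to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 2 overlaps (both set), row 3 is empty (both NaN)
df = pd.DataFrame({
    "col1": [1.0, np.nan, 5.0, np.nan],
    "col2": [np.nan, 3.0, 7.0, np.nan],
})
cols = ["col1", "col2"]

# Prefer col1, fall back to col2 -> 1, 3, 5, NaN
col1_first = df["col1"].fillna(df["col2"])

# Prefer col2, fall back to col1 -> 1, 3, 7, NaN
col2_first = df["col2"].fillna(df["col1"])

# Row-wise mean, skipping NaN -> 1, 3, 6, NaN
averaged = df[cols].mean(axis=1)

# Row-wise sum, skipping NaN -> 1, 3, 12, 0 (an all-NaN row sums to 0)
summed = df[cols].sum(axis=1)

# Same sum, but require at least one value -> 1, 3, 12, NaN
summed_keep_nan = df[cols].sum(axis=1, min_count=1)
```

When the two columns never overlap, as in the question, all of these give the same result on rows that have a value. They only differ on overlapping rows and on rows where everything is missing.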
