
Updating a DataFrame based on another DataFrame

Given DataFrame df :

    Id Sex  Group  Time  Time!
0  21   M      2  2.31    NaN
1   2   F      2  2.29    NaN

and update :

    Id Sex  Group  Time
0  21   M      2  2.36
1   2   F      2  2.09
2   3   F      1  1.79

I want to match on Id, Sex and Group, and then either update Time! with the Time value from the update df if there is a match, or insert a new record if there is not.

Here is how I do it:

import pandas as pd

df = df.set_index(['Id', 'Sex', 'Group'])
update = update.set_index(['Id', 'Sex', 'Group'])

for i, row in update.iterrows():
    if i in df.index:  # update the existing record
        df.loc[i, 'Time!'] = row['Time']
    else:              # insert a new record
        # build a one-row frame that keeps the key names
        idx = pd.MultiIndex.from_tuples([i], names=df.index.names)
        new_row = pd.DataFrame([row.values], index=idx, columns=update.columns)
        df = pd.concat([df, new_row])

print(df)

              Time  Time!
Id Sex Group             
21 M   2      2.31   2.36
2  F   2      2.29   2.09
3  F   1      1.79    NaN

The code seems to work, and the result above is what I want. However, I have noticed it misbehaving on a large data set, with the conditional

if i in df.index:
    ...
else:
    ...

clearly going wrong: it takes the else branch when it should not, and vice versa. I guess the MultiIndex may somehow be the cause.

So my question is, do you know any other way, or a more robust version of mine, to update one df based on another df?

I think I would do this with a merge, and then update the columns with a where. First remove the Time column from up:

In [11]: times = up.pop('Time')  # up = the update DataFrame

In [12]: df1 = df.merge(up, how='outer')

In [13]: df1
Out[13]:
   Id Sex  Group  Time  Time!
0  21   M      2  2.31    NaN
1   2   F      2  2.29    NaN
2   3   F      1   NaN    NaN
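Because no on= argument is given, the merge above joins on the columns the two frames share, here Id, Sex and Group. As a minimal sketch (not part of the original answer), the keys can be spelled out and indicator=True added so that each row reports whether it was found in both frames or only in the update. The frames are rebuilt from the question's data, and the names keys and checked are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Id": [21, 2], "Sex": ["M", "F"], "Group": [2, 2],
    "Time": [2.31, 2.29], "Time!": [np.nan, np.nan],
})
# The update frame as it looks after Time has been popped off.
up = pd.DataFrame({
    "Id": [21, 2, 3], "Sex": ["M", "F", "F"], "Group": [2, 2, 1],
})

keys = ["Id", "Sex", "Group"]

# indicator=True adds a "_merge" column with the values
# "both", "left_only" or "right_only" for every row.
checked = df.merge(up, on=keys, how="outer", indicator=True)
print(checked)
```

Rows marked right_only are the new records, and rows marked both are the ones whose Time! should be updated.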

Set Time! from times where Time is not NaN (a matched record), and fill Time from times where it is NaN (a new record):

In [14]: df1['Time!'] = df1['Time'].where(df1['Time'].isnull(), times)

In [15]: df1['Time'] = df1['Time'].where(df1['Time'].notnull(), times)

In [16]: df1
Out[16]:
   Id Sex  Group  Time  Time!
0  21   M      2  2.31   2.36
1   2   F      2  2.29   2.09
2   3   F      1  1.79    NaN
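One caveat: where aligns times with df1 by row label, so the steps above rely on the merged rows coming back in the same order as the update frame. The pandas documentation describes how='outer' as sorting keys lexicographically, so that order may not hold in every version. A sketch that avoids the dependency, under the same assumption that Time is never NaN for records already in df, is to carry the new times through the merge under a temporary column. The name Time_new is made up for this example.

```python
import numpy as np
import pandas as pd

keys = ["Id", "Sex", "Group"]

df = pd.DataFrame({
    "Id": [21, 2], "Sex": ["M", "F"], "Group": [2, 2],
    "Time": [2.31, 2.29], "Time!": [np.nan, np.nan],
})
update = pd.DataFrame({
    "Id": [21, 2, 3], "Sex": ["M", "F", "F"], "Group": [2, 2, 1],
    "Time": [2.36, 2.09, 1.79],
})

# Rename the incoming column (hypothetical name "Time_new") so each new
# time stays attached to its own key, whatever order the merge returns.
merged = df.merge(
    update.rename(columns={"Time": "Time_new"}), on=keys, how="outer"
)

# A record is matched if it existed in df and also appears in update.
matched = merged["Time"].notna() & merged["Time_new"].notna()

# Matched records take the new time in Time!; all others keep their Time!.
merged["Time!"] = merged["Time_new"].where(matched, merged["Time!"])

# New records have no Time yet, so fill it from the update.
merged["Time"] = merged["Time"].fillna(merged["Time_new"])

result = merged.drop(columns="Time_new")
print(result)
```

The rows may come back in a different order than in the output above, but every value is tied to its Id, Sex and Group.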
