
pandas replace column values except one

Original dataframe:

    DocID   DocURL                       DocName    SiteURL LibraryURL
0   29806   path/to/doc/docname1.doc    docname1    web/url lib/url
1   29807   path/to/doc/docname2.doc    docname2    web/url lib/url

New dataframe:

    DocURL                   DocName    SiteURL LibraryURL
0   path/to/doc/newname.doc  newname    web/url lib/url

I want to replace the row with DocID == 29806 with this new row.

I tried the following code, without success:

df.loc[:, df.columns != 'DocID'].loc[row_index] = new_df.iloc[0]

And this:

df.loc[row_index][1:] = new_df.iloc[0]

The first one gives no error or warning, but the dataframe is not changed. The second one gives:

A value is trying to be set on a copy of a slice from a DataFrame

I need the row in the original dataframe to be replaced with the row from the new dataframe, while keeping the DocID the same. The result must also be stored in the original dataframe.

One way could be to create a list of columns to replace and then use to_numpy to avoid any alignment issue like:

cols_replace = ['DocURL','DocName','SiteURL','LibraryURL']
df.loc[row_index, cols_replace] = new_df.loc[0, cols_replace].to_numpy()
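For reference, here is a self-contained sketch of this approach using the data from the question. It assumes DocID is stored as integers and selects the row with a boolean mask on DocID rather than a precomputed row_index (the mask is an illustrative choice, not part of the original snippet).

```python
import pandas as pd

df = pd.DataFrame({
    "DocID": [29806, 29807],
    "DocURL": ["path/to/doc/docname1.doc", "path/to/doc/docname2.doc"],
    "DocName": ["docname1", "docname2"],
    "SiteURL": ["web/url", "web/url"],
    "LibraryURL": ["lib/url", "lib/url"],
})
new_df = pd.DataFrame({
    "DocURL": ["path/to/doc/newname.doc"],
    "DocName": ["newname"],
    "SiteURL": ["web/url"],
    "LibraryURL": ["lib/url"],
})

cols_replace = ["DocURL", "DocName", "SiteURL", "LibraryURL"]

# Assumption: DocID holds integers; the mask picks the row to overwrite.
mask = df["DocID"] == 29806

# A single .loc call with both row and column selectors writes into df itself.
# .to_numpy() drops the labels of the new row, so no alignment takes place.
df.loc[mask, cols_replace] = new_df.loc[0, cols_replace].to_numpy()
```

Because the assignment goes through one .loc call on df, it avoids the chained indexing used in the question, and DocID is never touched because it is not in cols_replace.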

Just use df.update() to get what you need.

Code:

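import pandas as pd

# Original dataframe
df = pd.DataFrame({'DocID': [29806, 29807],
                   'DocURL': ['path/to/doc/docname1.doc', 'path/to/doc/docname2.doc'],
                   'DocName': ['docname1', 'docname2'],
                   'SiteURL': ['web/url', 'web/url'],
                   'LibraryURL': ['lib/url', 'lib/url']})

# New row; its index (0) matches the row to be replaced in df
df2 = pd.DataFrame({'DocURL': ['path/to/doc/newname.doc'],
                    'DocName': ['newname'],
                    'SiteURL': ['web/url'],
                    'LibraryURL': ['lib/url']})

# Updates df in place, aligning on index and column labels
df.update(df2)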

Output:

    DocID   DocURL                   DocName       SiteURL  LibraryURL
0   29806   path/to/doc/newname.doc  newname       web/url  lib/url
1   29807   path/to/doc/docname2.doc docname2      web/url  lib/url

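In this case, df.update() overwrites the original values in df with the new values in df2, in place. The update is aligned on the index, so make sure the index labels in df2 match those of the rows in df that you want to replace. Columns that are missing from df2 (here DocID) are left unchanged.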
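To illustrate the index alignment, here is a minimal sketch of a hypothetical variation in which the row to replace is the one with DocID 29807. With its default index 0, df2 would overwrite the first row instead, so the sketch relabels df2's index before calling update. The name target_label is introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "DocID": [29806, 29807],
    "DocURL": ["path/to/doc/docname1.doc", "path/to/doc/docname2.doc"],
    "DocName": ["docname1", "docname2"],
    "SiteURL": ["web/url", "web/url"],
    "LibraryURL": ["lib/url", "lib/url"],
})
df2 = pd.DataFrame({
    "DocURL": ["path/to/doc/newname.doc"],
    "DocName": ["newname"],
    "SiteURL": ["web/url"],
    "LibraryURL": ["lib/url"],
})

# Find the index label of the row to replace (assumes DocID is unique).
target_label = df.index[df["DocID"] == 29807][0]

# Give the new row the same index label so update() aligns it with that row.
df2.index = [target_label]

# In-place update; only columns present in df2 are modified.
df.update(df2)
```

The first row keeps its original values, and DocID is unchanged in both rows because df2 has no DocID column.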

Try this:

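# DocID is compared as an integer here; use '29806' only if the column holds strings.
cols = ['DocURL', 'DocName', 'SiteURL', 'LibraryURL']
df.loc[df['DocID'] == 29806, cols] = dfNew.iloc[0][cols].to_numpy()

new_df["DocID"] = [29806]

# set_index returns a new DataFrame, so assign the result back
old_df = old_df.set_index("DocID")
new_df = new_df.set_index("DocID")

old_df.update(new_df)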

Your best bet is to add a DocID column to the new dataframe, populated with the DocIDs of the rows you want to update in the old dataframe. Then set DocID as the index of both dataframes. Because .update aligns on the index by default, each new row then overwrites exactly the old row with the same DocID, regardless of row position.
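As a rough sketch of how this behaves with more than one replacement row: the third document (DocID 29808) and the replacement values below are made up for illustration, and the columns are trimmed to keep the example short. The new rows are deliberately listed in a different order than the old ones to show that matching is done by DocID, not by position.

```python
import pandas as pd

old_df = pd.DataFrame({
    "DocID": [29806, 29807, 29808],  # 29808 is hypothetical example data
    "DocURL": [
        "path/to/doc/docname1.doc",
        "path/to/doc/docname2.doc",
        "path/to/doc/docname3.doc",
    ],
    "DocName": ["docname1", "docname2", "docname3"],
})

# Replacement rows keyed by DocID, in a different order than old_df.
new_df = pd.DataFrame({
    "DocID": [29808, 29806],
    "DocURL": ["path/to/doc/new3.doc", "path/to/doc/new1.doc"],
    "DocName": ["new3", "new1"],
})

# Index both frames by DocID so update() matches rows on DocID.
old_df = old_df.set_index("DocID")
old_df.update(new_df.set_index("DocID"))

# Turn DocID back into an ordinary column.
old_df = old_df.reset_index()
```

Rows whose DocID does not appear in new_df (here 29807) are left as they were.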


 