pandas replace column values except one
Original dataframe:
DocID DocURL DocName SiteURL LibraryURL
0 29806 path/to/doc/docname1.doc docname1 web/url lib/url
1 29807 path/to/doc/docname2.doc docname2 web/url lib/url
New dataframe:
DocURL DocName SiteURL LibraryURL
0 path/to/doc/newname.doc newname web/url lib/url
I want to replace the row with DocID == 29806 with this new row.
I have tried the following code without success:
df.loc[:, df.columns != 'DocID'].loc[row_index] = new_df.iloc[0]
And this:
df.loc[row_index][1:] = new_df.iloc[0]
For the first one I don't get any error or warning, but the dataframe is not changed. For the second one I get:
A value is trying to be set on a copy of a slice from a DataFrame
I need the row in the original dataframe to be replaced with the row from the new dataframe, while keeping DocID the same. I also need the result to be stored in the original dataframe.
One way could be to create a list of columns to replace and then use to_numpy to avoid any alignment issue, like:
cols_replace = ['DocURL','DocName','SiteURL','LibraryURL']
df.loc[row_index, cols_replace] = new_df.loc[0, cols_replace].to_numpy()
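For reference, here is a self-contained sketch of this approach on the sample data from the question. How row_index is looked up is an assumption (the question does not show it), and it presumes DocID values are unique. The first half illustrates why the chained attempt silently does nothing: the assignment lands on an intermediate copy rather than on df.

```python
import pandas as pd

df = pd.DataFrame({
    "DocID": [29806, 29807],
    "DocURL": ["path/to/doc/docname1.doc", "path/to/doc/docname2.doc"],
    "DocName": ["docname1", "docname2"],
    "SiteURL": ["web/url", "web/url"],
    "LibraryURL": ["lib/url", "lib/url"],
})
new_df = pd.DataFrame({
    "DocURL": ["path/to/doc/newname.doc"],
    "DocName": ["newname"],
    "SiteURL": ["web/url"],
    "LibraryURL": ["lib/url"],
})

# Assumed lookup: index label of the row to replace (DocID taken to be unique).
row_index = df.index[df["DocID"] == 29806][0]

# Chained version: the boolean column selection produces a new object,
# so the assignment changes that object and leaves df as it was.
subset = df.loc[:, df.columns != "DocID"]
subset.loc[row_index] = new_df.iloc[0]
name_after_chained = df.loc[row_index, "DocName"]  # still the old name

# Single .loc call on df itself: row and columns selected in one step.
cols_replace = [c for c in df.columns if c != "DocID"]
df.loc[row_index, cols_replace] = new_df.loc[0, cols_replace].to_numpy()
print(df)
```

Selecting the row and the columns in one .loc call is what makes the write go to df itself; to_numpy() then drops the labels so the values are assigned by position.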
Just use df.update() to get what you need.
Code:
import pandas as pd

df = pd.DataFrame({'DocID': [29806, 29807],
                   'DocURL': ['path/to/doc/docname1.doc', 'path/to/doc/docname2.doc'],
                   'DocName': ['docname1', 'docname2'],
                   'SiteURL': ['web/url', 'web/url'],
                   'LibraryURL': ['lib/url', 'lib/url']})
df2 = pd.DataFrame({'DocURL': ['path/to/doc/newname.doc'],
                    'DocName': ['newname'],
                    'SiteURL': ['web/url'],
                    'LibraryURL': ['lib/url']})
# Modifies df in place; rows are matched on index label (df2 has index 0)
df.update(df2)
Output:
DocID DocURL DocName SiteURL LibraryURL
0 29806 path/to/doc/newname.doc newname web/url lib/url
1 29807 path/to/doc/docname2.doc docname2 web/url lib/url
df.update() in this case will overwrite the original values in df with the new values in df2. The update is done based on the index, so make sure the index labels in df2 match those in df.
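To make the index requirement concrete, here is a minimal sketch in which the row to replace is not at index 0. The column set is trimmed for brevity, and the way the target label is looked up is an assumption that presumes unique DocID values.

```python
import pandas as pd

df = pd.DataFrame({
    "DocID": [29806, 29807],
    "DocName": ["docname1", "docname2"],
})
df2 = pd.DataFrame({"DocName": ["newname"]})  # default index label is 0

# Re-label df2 so it lines up with the row holding DocID 29807 (label 1).
df2.index = df.index[df["DocID"] == 29807]

# update() works in place and only touches the columns present in df2,
# so DocID is left alone.
df.update(df2)
print(df)
```

Without the re-labelling step, update() would have overwritten the row at index 0 instead. Note also that update() only copies non-NA values from df2, so it cannot be used to blank out a cell.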
Try this:
cols = ['DocURL', 'DocName', 'SiteURL', 'LibraryURL']
df.loc[df['DocID'] == 29806, cols] = dfNew.iloc[0][cols].to_numpy()  # DocID is an int in the sample data
new_df["DocID"] = [29806]
# set_index returns a new dataframe, so the result has to be assigned back
old_df = old_df.set_index("DocID")
new_df = new_df.set_index("DocID")
old_df.update(new_df)
Your best bet is to add a DocID column to your new dataframe and populate it with the DocIDs from the old dataframe you'd like to update. Then, set DocID to be the index. Finally, calling .update defaults to aligning on indices, and the behavior is fully controlled.
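As a sketch of why aligning on DocID is useful, the example below updates several rows at once, with the replacement rows listed in a different order than in the old dataframe. The third document (29808) and the new names are made up for illustration, and the columns are trimmed for brevity.

```python
import pandas as pd

old_df = pd.DataFrame({
    "DocID": [29806, 29807, 29808],  # 29808 is a hypothetical extra row
    "DocName": ["docname1", "docname2", "docname3"],
    "SiteURL": ["web/url", "web/url", "web/url"],
})
# Replacement rows, deliberately not in the same order as old_df.
new_df = pd.DataFrame({
    "DocID": [29808, 29806],
    "DocName": ["newname3", "newname1"],
    "SiteURL": ["web/url", "web/url"],
})

# Align on DocID instead of row position, then restore DocID as a column.
old_df = old_df.set_index("DocID")
old_df.update(new_df.set_index("DocID"))
old_df = old_df.reset_index()
print(old_df)
```

Rows whose DocID does not appear in new_df (29807 here) are left unchanged, and the final reset_index() puts DocID back as an ordinary column.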