How to replace part of a DataFrame in pandas
I have a sample DataFrame like this:
df1=
A B C
a 1 2
b 3 4
b 5 6
c 7 8
d 9 10
I would like to replace part of this DataFrame (the rows where column A is b or c) with this DataFrame:
df2=
A B C
b 9 10
b 11 12
c 13 14
I would like to get the result below:
df3=
A B C
a 1 2
b 9 10
b 11 12
c 13 14
d 9 10
I tried:
df1[df1.A.isin("bc")]...
But I couldn't figure out how to do the replacement. Can someone tell me how to replace part of a DataFrame?
As I explained, try update. It aligns on the index, so df2 first needs the index labels of the matching rows in df1:
import pandas as pd
df1 = pd.DataFrame({"A":['a','b','b','c'], "B":[1,2,4,6], "C":[3,2,1,0]})
df2 = pd.DataFrame({"A":['b','b','c'], "B":[100,400,300], "C":[39,29,100]})
# df2 must exist before it can be referenced, so set its index in a second step
df2 = df2.set_index(df1.index[df1.A.isin(df2.A)])
df1.update(df2)
df1
Out[75]:
A B C
0 a 1.0 3.0
1 b 100.0 39.0
2 b 400.0 29.0
3 c 300.0 100.0
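For reference, here is a minimal sketch of the same idea applied to the data from the question. It assumes df2 has exactly one row for every matching row in df1, in the same order; the names target_index and result are only illustrative. Since update works in place and returns None, the sketch updates a copy, and it casts back to int because update can upcast numeric columns to float on some pandas versions.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "b", "b", "c", "d"],
                    "B": [1, 3, 5, 7, 9],
                    "C": [2, 4, 6, 8, 10]})
df2 = pd.DataFrame({"A": ["b", "b", "c"],
                    "B": [9, 11, 13],
                    "C": [10, 12, 14]})

# Index labels of the df1 rows whose A value also occurs in df2
target_index = df1.index[df1["A"].isin(df2["A"])]

# Assumption: one replacement row per matching row, in the same order
assert len(target_index) == len(df2)

result = df1.copy()
# update() aligns on index labels, modifies result in place and returns None
result.update(df2.set_index(target_index))

# Restore integer dtypes in case update() upcast them to float
result[["B", "C"]] = result[["B", "C"]].astype(int)
print(result)
```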
You need combine_first or update keyed on column A, but because A contains duplicates you also need cumcount to number the repeated values:
df1['g'] = df1.groupby('A').cumcount()
df2['g'] = df2.groupby('A').cumcount()
df1 = df1.set_index(['A','g'])
df2 = df2.set_index(['A','g'])
df3 = df2.combine_first(df1).reset_index(level=1, drop=True).astype(int).reset_index()
print (df3)
A B C
0 a 1 2
1 b 9 10
2 b 11 12
3 c 13 14
4 d 9 10
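To see why cumcount is needed here, a small sketch (the variable names counter and keys are only illustrative): column A alone is not a unique key because b appears twice, but A together with its per-group occurrence number is.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "b", "b", "c", "d"]})

# Number each row within its group of equal A values, starting at 0
counter = df1.groupby("A").cumcount()
print(counter.tolist())    # [0, 0, 1, 0, 0]

# A alone has duplicates, the (A, counter) pair does not
keys = pd.MultiIndex.from_arrays([df1["A"], counter], names=["A", "g"])
print(df1["A"].is_unique)  # False
print(keys.is_unique)      # True
```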
Another solution:
df1['g'] = df1.groupby('A').cumcount()
df2['g'] = df2.groupby('A').cumcount()
df1 = df1.set_index(['A','g'])
df2 = df2.set_index(['A','g'])
df1.update(df2)
df1 = df1.reset_index(level=1, drop=True).astype(int).reset_index()
print (df1)
A B C
0 a 1 2
1 b 9 10
2 b 11 12
3 c 13 14
4 d 9 10
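If you want to reuse this without modifying df1 and df2, the steps can be wrapped in a function. This is only a sketch: replace_by_key and the temporary column name _g are hypothetical, not part of pandas. It uses update rather than combine_first so that the row order of the base frame is kept; rows of the patch that have no counterpart in the base are ignored.

```python
import pandas as pd

def replace_by_key(base, patch, key):
    """Hypothetical helper: overwrite rows of base with rows of patch,
    matched on key plus the occurrence number within each key value."""
    # assign() returns new frames, so the caller's inputs are not modified
    b = base.assign(_g=base.groupby(key).cumcount()).set_index([key, "_g"])
    p = patch.assign(_g=patch.groupby(key).cumcount()).set_index([key, "_g"])
    b.update(p)  # in place on the local copy only
    # Restore the original dtypes in case update() upcast them to float
    b = b.astype(base.drop(columns=key).dtypes.to_dict())
    return b.reset_index(level="_g", drop=True).reset_index()

df1 = pd.DataFrame({"A": ["a", "b", "b", "c", "d"],
                    "B": [1, 3, 5, 7, 9],
                    "C": [2, 4, 6, 8, 10]})
df2 = pd.DataFrame({"A": ["b", "b", "c"],
                    "B": [9, 11, 13],
                    "C": [10, 12, 14]})

df3 = replace_by_key(df1, df2, "A")
print(df3)
```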
If the duplicated values of column A in df1 are the same as in df2 and occur the same number of times:
df2.index = df1.index[df1.A.isin(df2.A)]
df3 = df2.combine_first(df1)
print (df3)
A B C
0 a 1.0 2.0
1 b 9.0 10.0
2 b 11.0 12.0
3 c 13.0 14.0
4 d 9.0 10.0
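That precondition can be checked before reassigning the index. A minimal sketch, assuming the data from the question (mask and same_keys are illustrative names); the final cast is there because combine_first may return float columns, as in the output above.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "b", "b", "c", "d"],
                    "B": [1, 3, 5, 7, 9],
                    "C": [2, 4, 6, 8, 10]})
df2 = pd.DataFrame({"A": ["b", "b", "c"],
                    "B": [9, 11, 13],
                    "C": [10, 12, 14]})

# Rows of df1 that are going to be replaced
mask = df1["A"].isin(df2["A"])

# The matching rows must carry the same A values, in the same order, as df2
same_keys = df1.loc[mask, "A"].tolist() == df2["A"].tolist()
if not same_keys:
    raise ValueError("df2 does not line up with the matching rows of df1")

# Work on a copy so the original df2 keeps its own index
patch = df2.set_index(df1.index[mask])
df3 = patch.combine_first(df1)

# Cast back to int in case combine_first returned floats
df3[["B", "C"]] = df3[["B", "C"]].astype(int)
print(df3)
```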
You could solve your problem with the following:
import pandas as pd
df1 = pd.DataFrame({'A':['a','b','b','c','d'],'B':[1,3,5,7,9],'C':[2,4,6,8,10]})
df2 = pd.DataFrame({'A':['b','b','c'],'B':[9,11,13],'C':[10,12,14]})
# Give df2 the index labels of the df1 rows it replaces, so the assignment aligns
df2 = df2.set_index(df1.index[df1.A.isin(df2.A)])
df1.loc[df1.A.isin(df2.A), ['B', 'C']] = df2[['B', 'C']]
df1
Out[108]:
A B C
0 a 1 2
1 b 9 10
2 b 11 12
3 c 13 14
4 d 9 10
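A possible variation, sketched under the same assumption that df2 has one row per matching row of df1 in the same order: assigning a plain NumPy array skips index alignment entirely, so df2 does not need its index changed. The name mask is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "b", "b", "c", "d"],
                    "B": [1, 3, 5, 7, 9],
                    "C": [2, 4, 6, 8, 10]})
df2 = pd.DataFrame({"A": ["b", "b", "c"],
                    "B": [9, 11, 13],
                    "C": [10, 12, 14]})

mask = df1["A"].isin(df2["A"])

# Assumption: same number of rows on both sides, in the same order
assert mask.sum() == len(df2)

# to_numpy() drops df2's index, so values are written purely by position
df1.loc[mask, ["B", "C"]] = df2[["B", "C"]].to_numpy()
print(df1)
```

Because the values are written by position, this only gives the right answer when the row order of df2 matches the order of the selected rows in df1.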