Replace values of a column from another dataframe values based on a condition - Python
My problem is as follows:
I have a first dataframe (df1):
Client | Ticket | Region | Profile |
---|---|---|---|
BCA | 1234 | US | Shanon |
ERROR | 3453 | EMEA | Laura |
RZ | 7988 | EUROPE | Mitch |
ERROR | 2364 | US | James |
Trp | 3429 | MX | Roger |
This is my second dataframe (df2):
Client | Ticket |
---|---|
HHA | 3534 |
Alphabet | 3453 |
HP | 2355 |
AMD | 2364 |
I would like to replace the 'ERROR' values in the 'Client' column of df1 with values from the 'Client' column of df2, but I'm having trouble doing it based on the 'Ticket' values. In this example, the first ERROR should be replaced by Alphabet (Ticket 3453) and the second ERROR should be replaced by AMD (Ticket 2364).
Finally, the desired output should be something like this:
Client | Ticket | Region | Profile |
---|---|---|---|
BCA | 1234 | US | Shanon |
Alphabet | 3453 | EMEA | Laura |
RZ | 7988 | EUROPE | Mitch |
AMD | 2364 | US | James |
Trp | 3429 | MX | Roger |
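For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames from the tables above and picks out the rows that need fixing. It assumes Ticket is stored as an integer in both frames, which the tables do not confirm; `needs_fix` is just an illustrative name.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
# Assumption: Ticket is an integer column in both frames.
df1 = pd.DataFrame({
    "Client": ["BCA", "ERROR", "RZ", "ERROR", "Trp"],
    "Ticket": [1234, 3453, 7988, 2364, 3429],
    "Region": ["US", "EMEA", "EUROPE", "US", "MX"],
    "Profile": ["Shanon", "Laura", "Mitch", "James", "Roger"],
})
df2 = pd.DataFrame({
    "Client": ["HHA", "Alphabet", "HP", "AMD"],
    "Ticket": [3534, 3453, 2355, 2364],
})

# Boolean mask of the df1 rows whose Client has to be looked up in df2.
needs_fix = df1["Client"].eq("ERROR")
print(df1.loc[needs_fix, ["Client", "Ticket"]])
```

If the Ticket columns had different dtypes in the real data (for example strings in one frame and integers in the other), the lookups would likely find no matches, so it may be worth checking `df1.dtypes` and `df2.dtypes` first.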
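# Left-merge on Ticket; the overlapping Client columns become Client_x (df1) and Client_y (df2)
data = df1.merge(df2, on='Ticket', how='left')
data.loc[data.Client_x.eq("ERROR"), "Client_x"] = data.Client_y
# drop/rename return new frames, so assign the result back
data = data.drop(columns=['Client_y']).rename(columns={'Client_x': 'Client'})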
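A self-contained variant of the merge idea, as a rough sketch: it uses the `suffixes` argument so the df1 column keeps its name, and only overwrites ERROR rows that actually found a match. The `_lookup` suffix and the names `merged`, `replace`, and `result` are illustrative choices, and the sample data is copied from the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Client": ["BCA", "ERROR", "RZ", "ERROR", "Trp"],
    "Ticket": [1234, 3453, 7988, 2364, 3429],
    "Region": ["US", "EMEA", "EUROPE", "US", "MX"],
    "Profile": ["Shanon", "Laura", "Mitch", "James", "Roger"],
})
df2 = pd.DataFrame({
    "Client": ["HHA", "Alphabet", "HP", "AMD"],
    "Ticket": [3534, 3453, 2355, 2364],
})

# Left merge keeps every df1 row; the df2 client lands in Client_lookup.
merged = df1.merge(df2, on="Ticket", how="left", suffixes=("", "_lookup"))

# Only touch rows that are ERROR and found a matching ticket in df2.
replace = merged["Client"].eq("ERROR") & merged["Client_lookup"].notna()
merged["Client"] = merged["Client"].mask(replace, merged["Client_lookup"])

# Drop the helper column to get back the original layout.
result = merged.drop(columns="Client_lookup")
print(result)
```

One caveat: if df2 contained the same Ticket more than once, the left merge would duplicate the corresponding df1 rows, so this assumes tickets are unique in df2.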
You can assign with map:
df1.loc[df1['Client'].eq('ERROR'),'Client'] = df1['Ticket'].map(df2.set_index('Ticket')['Client'])
df1
Out[192]:
Client Ticket Region Profile
0 BCA 1234 US Shanon
1 Alphabet 3453 EMEA Laura
2 RZ 7988 EUROPE Mitch
3 AMD 2364 US James
4 Trp 3429 MX Roger
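One thing to watch with the map approach: `map` returns NaN for a ticket that is missing from df2, so an ERROR row without a match would end up as NaN. A possible sketch that keeps 'ERROR' in that case is below. The extra row with Ticket 9999 is added here purely to illustrate the unmatched case, and `lookup` and `is_error` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Client": ["BCA", "ERROR", "RZ", "ERROR", "Trp", "ERROR"],
    "Ticket": [1234, 3453, 7988, 2364, 3429, 9999],  # 9999 has no match in df2
    "Region": ["US", "EMEA", "EUROPE", "US", "MX", "US"],
    "Profile": ["Shanon", "Laura", "Mitch", "James", "Roger", "James"],
})
df2 = pd.DataFrame({
    "Client": ["HHA", "Alphabet", "HP", "AMD"],
    "Ticket": [3534, 3453, 2355, 2364],
})

# Ticket -> Client lookup; assumes each Ticket appears at most once in df2.
lookup = df2.set_index("Ticket")["Client"]

is_error = df1["Client"].eq("ERROR")
# Map only the ERROR rows, and fall back to 'ERROR' when no ticket matches.
df1.loc[is_error, "Client"] = (
    df1.loc[is_error, "Ticket"].map(lookup).fillna("ERROR")
)
print(df1)
```

Mapping only the masked rows also avoids looking up tickets for rows that are never going to be replaced.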
Here's a way:
df1 = df1.set_index('Ticket')
df1.loc[df1.Client=='ERROR','Client'] = df2.set_index('Ticket').Client
df1.Client = df1.Client.fillna('ERROR')
df1 = df1.reset_index()[['Client', 'Ticket'] + [col for col in df1.columns if col != 'Client']]
Input:
df1
Client Ticket Region Profile
0 BCA 1234 US Shanon
1 ERROR 3453 EMEA Laura
2 RZ 7988 EUROPE Mitch
3 ERROR 2364 US James
4 Trp 3429 MX Roger
5 ERROR 9999 US James
df2
Client Ticket
0 HHA 3534
1 Alphabet 3453
2 HP 2355
3 AMD 2364
Output:
Client Ticket Region Profile
0 BCA 1234 US Shanon
1 Alphabet 3453 EMEA Laura
2 RZ 7988 EUROPE Mitch
3 AMD 2364 US James
4 Trp 3429 MX Roger
5 ERROR 9999 US James