
Compare two df's and populate value from one df to another

I would like to compare the amount_spent column in df1 and df2. If amount_spent in df1 is null (the string 'null', not NaN), I want to populate the value from df2 into df1 for that particular customer_id.

df1

customer_id  amount_spent 
3021         144
0535         042
7532         null 
2131         932

df2

3021         144
0535         042
7532         945 

Desired output df

3021         144
0535         042
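7532         945

import pandas as pd
from numpy import nan

# Note: this example stores the missing value as a real NaN. If the column holds
# the string 'null' instead, convert it first, e.g. df_1.replace('null', nan).
data_1 = [['3021', '144'], ['0535', '042'], ['7532', nan]]
data_2 = [['3021', '144'], ['0535', '042'], ['7532', '945']]

df_1 = pd.DataFrame(data_1, columns=['customer_id', 'amount_spent'])
df_2 = pd.DataFrame(data_2, columns=['customer_id', 'amount_spent'])

# fillna aligns on the row index and column names, not on customer_id,
# so both frames must list the customers in the same order.
print(df_1.fillna(df_2))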

Output:

  customer_id amount_spent
0        3021          144
1        0535          042
2        7532          945
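
The question states that the missing marker is the string 'null' rather than NaN, and fillna matches rows by index rather than by customer_id. A minimal sketch of the same fillna idea adapted to those two conditions could look like the following. The names `left`, `right`, and `result`, and the shuffled row order in df_2, are assumptions made for illustration, and customer_id is assumed to be unique in each frame.

```python
import pandas as pd

df_1 = pd.DataFrame({"customer_id": ["3021", "0535", "7532", "2131"],
                     "amount_spent": ["144", "042", "null", "932"]})
# df_2 is deliberately in a different row order (an assumption for illustration)
df_2 = pd.DataFrame({"customer_id": ["7532", "3021", "0535"],
                     "amount_spent": ["945", "144", "042"]})

# Index both frames by customer_id so fillna aligns on the customer,
# not on the row position.
left = df_1.set_index("customer_id")
right = df_2.set_index("customer_id")

# Turn the 'null' strings into real missing values so fillna can see them.
left = left.mask(left == "null")

result = left.fillna(right).reset_index()
print(result)
```

Customers that are not 'null' in df_1, or that do not appear in df_2 at all, keep their original value.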

Try this:

import pandas as pd

df1 = pd.DataFrame({"customer_id": ['3021', '0535', '7532'], "amount_spent": ['144', '042', 'null']})
df2 = pd.DataFrame({"customer_id": ['3021', '0535', '7532'], "amount_spent": ['144', '042', '945']})

# Row labels where df1 holds the string 'null'
null_list = df1.index[df1['amount_spent'] == 'null'].tolist()

# Assign through .loc; chained indexing (df1[col][i] = ...) may write to a
# temporary copy and can emit SettingWithCopyWarning.
for idx in null_list:
    df1.loc[idx, "amount_spent"] = df2.loc[idx, "amount_spent"]

It builds a list of all row indices that satisfy the condition and copies the corresponding value from df2 into df1.
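
The loop matches rows by index, so it relies on both frames listing the customers in the same order. A loop-free sketch that looks each value up by customer_id instead is shown here, using Series.map. The names `lookup` and `is_null` are illustrative and not part of the original answer, df2 is reordered on purpose, and customer_id is assumed to be unique in df2.

```python
import pandas as pd

df1 = pd.DataFrame({"customer_id": ['3021', '0535', '7532'],
                    "amount_spent": ['144', '042', 'null']})
# Different row order than df1 (an assumption, to show order does not matter)
df2 = pd.DataFrame({"customer_id": ['7532', '0535', '3021'],
                    "amount_spent": ['945', '042', '144']})

# Series mapping customer_id -> amount_spent, taken from df2
lookup = df2.set_index("customer_id")["amount_spent"]

# Boolean mask of the rows in df1 that hold the string 'null'
is_null = df1["amount_spent"] == "null"

# Replace only those rows, looking each customer up in df2
df1.loc[is_null, "amount_spent"] = df1.loc[is_null, "customer_id"].map(lookup)
print(df1)
```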

You can try:

import pandas as pd

df1 = pd.DataFrame({"customer_id": ['3021', '0535', '7532'], "amount_spent": ['144', '042', 'null']})
df2 = pd.DataFrame({"customer_id": ['3021', '0535', '7532'], "amount_spent": ['144', '042', '945']})

# Index both frames by customer_id so the assignment aligns on it
df1 = df1.set_index("customer_id")
df2 = df2.set_index("customer_id")
# Overwrite only the rows where amount_spent is the string "null"
df1.loc[df1['amount_spent'] == "null", 'amount_spent'] = df2['amount_spent']
df1 = df1.reset_index()
print(df1)

It gives:

  customer_id amount_spent
0        3021          144
1        0535          042
2        7532          945
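
If a customer that is 'null' in df1 does not appear in df2 at all, the aligned assignment has no value to copy and would most likely leave NaN there. One possible way to guard against that is to fill only the customers that exist in df2, sketched here as a small helper. The function name `fill_from`, its parameters, and the customer '9999' are hypothetical, and customer_id is assumed to be unique in both frames.

```python
import pandas as pd

def fill_from(target, source, key="customer_id", col="amount_spent", marker="null"):
    """Return a copy of target in which `marker` values in `col` are replaced
    by the value source holds for the same key, when source has that key."""
    t = target.set_index(key)          # set_index returns a new frame
    s = source.set_index(key)[col]
    is_marker = t[col] == marker
    # Only touch rows whose key actually exists in source
    fillable = is_marker & t.index.isin(s.index)
    t.loc[fillable, col] = s           # aligns on the key index
    return t.reset_index()

df1 = pd.DataFrame({"customer_id": ["3021", "0535", "7532", "9999"],
                    "amount_spent": ["144", "042", "null", "null"]})
df2 = pd.DataFrame({"customer_id": ["3021", "0535", "7532"],
                    "amount_spent": ["144", "042", "945"]})

result = fill_from(df1, df2)
print(result)
```

In this sketch customer '9999' keeps the 'null' marker because df2 has no entry for it, and df1 itself is left unmodified.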
