How to create a new column that returns value from another table under certain criteria in Python

I have two tables:

  • Account Balance

It has three columns: Account_ID, Date, Balance_amount

  • Account Transaction

It has three columns: Account_ID, date, transaction_amount

These two tables have different numbers of rows, and not every account has a transaction amount. So I want to create a new column in the Account Balance table called transaction_amount that returns the transaction_amount if that account appears in the Account Transaction table, and 0 otherwise. I tried np.where(data1.account_id.isin(data2._account_id), data2.amount,0), but it fails with "operands could not be broadcast together with shapes (123171,) (668306,) ()". How can I solve this in Python?

I'm assuming you're using pandas.

If you have multiple transaction_amounts per account_id in data2, using merge is probably your best bet:

data1.merge(data2, on='account_id', how='left')

This returns np.nan for account_ids that are in data1 but not in data2. It also returns both date columns, one from data1 and one from data2, and one row per transaction_amount for each account_id.
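As a rough sketch of the merge approach, assuming both frames use lowercase column names (account_id, date, balance_amount, transaction_amount) and using made-up sample rows, the NaN values can be replaced with 0 via fillna to match what the question asks for. The suffixes argument is only there to tell the two date columns apart:

```python
import pandas as pd

# Hypothetical sample data; column names are assumed, not taken from the real tables.
data1 = pd.DataFrame({
    "account_id": [1, 2, 3],
    "date": ["2020-01-31", "2020-01-31", "2020-01-31"],
    "balance_amount": [100.0, 250.0, 75.0],
})
data2 = pd.DataFrame({
    "account_id": [1, 1, 3],
    "date": ["2020-01-05", "2020-01-20", "2020-01-11"],
    "transaction_amount": [10.0, -5.0, 20.0],
})

# A left merge keeps every row of data1; the shared "date" column name
# is disambiguated with suffixes.
merged = data1.merge(
    data2, on="account_id", how="left", suffixes=("_balance", "_transaction")
)

# Accounts with no transactions come back as NaN; replace those with 0.
merged["transaction_amount"] = merged["transaction_amount"].fillna(0)

print(merged)
```

Note that account 1 appears twice in the result because it has two transactions, so the merged frame can have more rows than data1.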

If there is only one transaction_amount per account_id, you can convert data2 to a dictionary and have it mapped to data1 like this:

data2_dict = data2.set_index('account_id').to_dict()['transaction_amount']
data1['transaction_amount'] = data1['account_id'].map(data2_dict)

You'll also get np.nan for account_ids in data1 but not data2.
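A minimal sketch of the map approach with the missing values filled in, again using made-up sample rows and assuming account_id is unique in data2. Here a Series indexed by account_id is used as the lookup instead of a dictionary, which behaves the same way for this purpose:

```python
import pandas as pd

# Hypothetical sample data; data2 is assumed to hold one row per account_id.
data1 = pd.DataFrame({
    "account_id": [1, 2, 3],
    "balance_amount": [100.0, 250.0, 75.0],
})
data2 = pd.DataFrame({
    "account_id": [1, 3],
    "transaction_amount": [10.0, 20.0],
})

# Build a lookup from account_id to transaction_amount.
lookup = data2.set_index("account_id")["transaction_amount"]

# Map each account_id in data1 through the lookup; unmatched ids give NaN,
# which fillna(0) turns into 0.
data1["transaction_amount"] = data1["account_id"].map(lookup).fillna(0)

print(data1)
```

Unlike the merge, this keeps data1 at its original number of rows, which is why it only fits the one-transaction-per-account case.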
