I have one DataFrame called limits_df with the schema:
"County Name" "State" "One-Unit Limit"
This looks like:
import pandas as pd
data1 = {'County Name': ["A", "B", "C", "D"], 'State': ['AA', 'AB', 'AA', 'AC'], 'One-Unit Limit': [100, 200, 150, 300]}
limits_df = pd.DataFrame.from_dict(data1)
And I have another DataFrame called loans_df with the schema:
county state price
This looks like:
data2 = {'county': ["B", "C", "A", "E"], 'state': ['AB', 'AC', 'AA', 'AF'], 'price': [300, 200, 150, 300]}
loans_df = pd.DataFrame.from_dict(data2)
I want to create a new column, loans_df["jumbo"], which is True when the loan price is greater than the limit for its corresponding county and state. For a single loan, the logic would be:
county_limit = limits_df.loc[ (limits_df["County Name"] == str(loans_df["county"])) & (limits_df["State"] == str(loans_df["state"])) ]["One-Unit Limit"].item()
loan_price = loans_df["price"].item()
if loan_price > county_limit:
    loans_df["jumbo"] = True
else:
    loans_df["jumbo"] = False
Doing this with iterrows takes a really long time, since I need to create loans_df["jumbo"] and then change what should be immutable data. Isn't there a simpler way to do this with apply() or map()?
IIUC, you could use
df2 = loans_df.merge(limits_df[['State', 'County Name', 'One-Unit Limit']], how='left',
                     left_on=['state', 'county'], right_on=['State', 'County Name'])
df2['jumbo'] = df2['price'] > df2['One-Unit Limit']
Here pd.merge with a left join matches a limit to every loan by state and county. You can then do a boolean comparison directly to determine whether jumbo is True or False.
Note that when no limit is found for a loan's state/county, the merged limit is NaN, the comparison evaluates to False, and jumbo is therefore False for that loan.
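If you need to tell "not jumbo" apart from "no limit found", one option is the indicator parameter of merge. The following is only a sketch on the sample data from the question; the has_limit column name is an assumption chosen for illustration, not something from the original post.

```python
import pandas as pd

limits_df = pd.DataFrame({
    "County Name": ["A", "B", "C", "D"],
    "State": ["AA", "AB", "AA", "AC"],
    "One-Unit Limit": [100, 200, 150, 300],
})
loans_df = pd.DataFrame({
    "county": ["B", "C", "A", "E"],
    "state": ["AB", "AC", "AA", "AF"],
    "price": [300, 200, 150, 300],
})

# Left join keeps every loan; indicator=True adds a "_merge" column
# that is "both" when a matching limit row was found.
merged = loans_df.merge(
    limits_df,
    how="left",
    left_on=["state", "county"],
    right_on=["State", "County Name"],
    indicator=True,
)

# "has_limit" is a hypothetical helper column for this sketch.
merged["has_limit"] = merged["_merge"] == "both"

# Comparing against NaN yields False, so unmatched loans are not flagged.
merged["jumbo"] = merged["price"] > merged["One-Unit Limit"]

result = merged[["county", "state", "price", "has_limit", "jumbo"]]
print(result)
```

Rows where has_limit is False are the loans whose county/state pair does not appear in limits_df, so their jumbo value should be read as "unknown" rather than "below the limit".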
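Alternatively, merge with loans_df on the left so that the result keeps the row order of loans_df, then compare row by row. This assumes that loans_df has the default integer index and that each county/state pair appears at most once in limits_df; otherwise the assigned values will not line up with the loans:
loans_df['jumbo'] = pd.merge(loans_df, limits_df,
                             left_on=['county', 'state'],
                             right_on=['County Name', 'State'], how='left') \
    .apply(lambda x: x['price'] > x['One-Unit Limit'], axis=1)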
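m = limits_df.merge(loans_df, left_on=['County Name', 'State'], right_on=['county', 'state'])
# Note: isin matches on county name only, so this assumes county names are unique across states.
loans_df["jumbo"] = loans_df['county'].isin(m.loc[m['price'] > m['One-Unit Limit'], 'County Name'])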
print(loans_df)
  county state  price  jumbo
0      B    AB    300   True
1      C    AC    200  False
2      A    AA    150   True
3      E    AF    300  False
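Since the question asks about a map()-style lookup, here is a rough sketch of that idea without a merge: build a Series of limits keyed by (State, County Name) and look up each loan's key with reindex. It assumes each county/state pair occurs only once in limits_df; the names limit_by_key, keys and loan_limits are made up for this example.

```python
import pandas as pd

limits_df = pd.DataFrame({
    "County Name": ["A", "B", "C", "D"],
    "State": ["AA", "AB", "AA", "AC"],
    "One-Unit Limit": [100, 200, 150, 300],
})
loans_df = pd.DataFrame({
    "county": ["B", "C", "A", "E"],
    "state": ["AB", "AC", "AA", "AF"],
    "price": [300, 200, 150, 300],
})

# Lookup table: limit keyed by (State, County Name).
# Assumption: the key pairs are unique in limits_df.
limit_by_key = limits_df.set_index(["State", "County Name"])["One-Unit Limit"]

# One (state, county) key per loan, in loans_df row order.
keys = pd.MultiIndex.from_frame(loans_df[["state", "county"]])

# reindex returns NaN for keys that have no limit.
loan_limits = limit_by_key.reindex(keys).to_numpy()

# Positional comparison; NaN limits compare as False.
loans_df["jumbo"] = loans_df["price"].to_numpy() > loan_limits
print(loans_df)
```

Because the comparison is done on plain arrays in loans_df row order, this does not depend on the index of loans_df, and it matches on state as well as county.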