Using one pandas dataframe to populate a new column in another pandas dataframe
I have two dataframes. The first is df_states and the second is state_lookup.
df_states
state code score
0 Texas 0 0.753549
1 Pennsylvania 0 0.998119
2 California 1 0.125751
3 Texas 2 0.125751
state_lookup
state code_0 code_1 code_2
0 Texas 2014 2015 2019
1 Pennsylvania 2015 2016 2017
2 California 2014 2015 2019
I want to create a new column in df_states called 'year'. Its value comes from the state_lookup table: for each row, the 'code' column selects which code_N column of state_lookup to read for that row's state. For example, if Texas has code = 0, then according to state_lookup the year should be 2014. If Texas has code = 2, the year should be 2019.
This is what the end result should look like:
df_states
state code score year
0 Texas 0 0.753 2014
1 Pennsylvania 0 0.998 2015
2 California 1 0.125 2015
3 Texas 2 0.125 2019
I've tried using a for loop to iterate through each row, but I am unable to get it to work. How would you achieve this?
You can first use wide_to_long on your state_lookup df so that you can perform a merge:
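import pandas as pd

# Reshape code_0/code_1/code_2 into rows; the numeric suffix goes into the column named by j
s = pd.wide_to_long(state_lookup, stubnames="code", sep="_", i="state", j="year", suffix=r"\d+").reset_index()
s.columns = ["state", "code", "year"]  # rename: the suffix column becomes "code", the value column becomes "year"
print(df_states.merge(s, on=["state", "code"], how="left"))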
state code score year
0 Texas 0 0.753549 2014
1 Pennsylvania 0 0.998119 2015
2 California 1 0.125751 2015
3 Texas 2 0.125751 2019
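To make the reshape step easier to follow, here is a minimal self-contained sketch of the same idea. The frames are retyped from the question's tables, with Pennsylvania's code_2 taken to be 2017 (an assumption). The names long_lookup and suffix are illustrative only, and validate="many_to_one" is an optional safety check that is not part of the answer above.

```python
import pandas as pd

df_states = pd.DataFrame({
    "state": ["Texas", "Pennsylvania", "California", "Texas"],
    "code": [0, 0, 1, 2],
    "score": [0.753549, 0.998119, 0.125751, 0.125751],
})
state_lookup = pd.DataFrame({
    "state": ["Texas", "Pennsylvania", "California"],
    "code_0": [2014, 2015, 2014],
    "code_1": [2015, 2016, 2015],
    "code_2": [2019, 2017, 2019],  # Pennsylvania's 2017 is an assumed value
})

# Long format: one row per (state, code) pair instead of one column per code.
long_lookup = (
    pd.wide_to_long(state_lookup, stubnames="code", sep="_",
                    i="state", j="suffix", suffix=r"\d+")
    .reset_index()
    # The numeric suffix is the code; the stub column holds the year values.
    .rename(columns={"suffix": "code", "code": "year"})
)

# validate raises MergeError if a (state, code) pair occurs twice in the lookup.
result = df_states.merge(long_lookup, on=["state", "code"], how="left",
                         validate="many_to_one")
print(result)
```

Using rename with a mapping rather than assigning s.columns positionally avoids depending on the column order that reset_index happens to produce.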
Load the dataframes:
import pandas as pd

df_states = pd.DataFrame({'state': ['Texas', 'Pennsylvania', 'California', 'Texas'], 'code': [0, 0, 1, 2], 'score': [0.753549, 0.998119, 0.125751, 0.12575]})
state_lookup = pd.DataFrame({'state': ['Texas', 'Pennsylvania', 'California'], 'code_0': [2014, 2015, 2014], 'code_1': [2015, 2016, 2017], 'code_2': [2019, 2017, 2019]})
First use melt to convert your code_ columns into rows:
melted_lookup = pd.melt(state_lookup,
                        id_vars=['state'],
                        value_vars=[col for col in state_lookup.columns if col.startswith('code_')],
                        var_name='new_code',
                        value_name='year')
Then merge the two dataframes:
# Build the same "code_N" label in df_states so both frames share a join key
df_states['new_code'] = "code_" + df_states['code'].astype(str)
df_states = pd.merge(df_states, melted_lookup, how='left', on=['new_code', 'state'])
# state code score new_code year
#0 Texas 0 0.753549 code_0 2014
#1 Pennsylvania 0 0.998119 code_0 2015
#2 California 1 0.125751 code_1 2017
#3 Texas 2 0.125750 code_2 2019
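A possible variant of the melt approach, sketched under the assumption that every lookup column is named code_ followed by an integer: convert the melted labels back to integers so the merge can use the existing 'code' column and no helper column is left behind in df_states. The data is the same as in the constructors above; the names melted and result are illustrative.

```python
import pandas as pd

df_states = pd.DataFrame({
    "state": ["Texas", "Pennsylvania", "California", "Texas"],
    "code": [0, 0, 1, 2],
    "score": [0.753549, 0.998119, 0.125751, 0.12575],
})
state_lookup = pd.DataFrame({
    "state": ["Texas", "Pennsylvania", "California"],
    "code_0": [2014, 2015, 2014],
    "code_1": [2015, 2016, 2017],
    "code_2": [2019, 2017, 2019],
})

# Every column except "state" becomes a (code, year) row.
melted = state_lookup.melt(id_vars="state", var_name="code", value_name="year")

# Turn the labels "code_0", "code_1", "code_2" into the integers 0, 1, 2.
melted["code"] = (
    melted["code"].str.replace("code_", "", regex=False).astype("int64")
)

# Join on the columns both frames now share; left join keeps df_states' row order.
result = df_states.merge(melted, on=["state", "code"], how="left")
print(result)
```

With this data the year column comes out as 2014, 2015, 2017, 2019, the same values as the output above, without the extra new_code column.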