How can I create a column in a dataframe by indexing another dataframe with the values in two of its columns?
Good day. I have two datasets (df1, df2).
I am trying to fill the column 'values' in df2 by using the column 'site_before' as the row label in df1 and the column 'site' as the column label in df1.
The dataset df1:
ANA01 PHO01 ATL BAL12 BOS07
ANA01 0 0 3 3 3
PHO01 0 0 3 3 3
ATL -3 -3 0 0 0
BAL12 -3 -3 0 0 0
BOS07 -3 -3 0 0 0
(The first column is the row index.)
The dataset df2:
Game_ID site_before site values
1 ANA199804010 ANA01 ANA01
3 ANA199804020 ANA01 ATL
5 ANA199804030 ANA01 BAL12
7 ANA199804040 ANA01 BOS07
9 ANA199804050 ANA01 ANA01
674 BOS199804300 BOS07 BOS07
31 ANA199805010 BOS07 ANA01
33 ANA199805020 PHO01 ANA01
35 ANA199805030 PHO01 PHO01
37 ANA199805040 PHO01 ATL
39 ANA199805050 PHO01 BAL12
I tried:
df2['values'] = df1.loc[df2['site_before'], df2['site']].values
but I got an error: ValueError: Wrong number of items passed 4864, placement implies 1
The result I am expecting is:
Game_ID site_before site values
1 ANA199804010 ANA01 ANA01 0
3 ANA199804020 ANA01 ATL 3
5 ANA199804030 ANA01 BAL12 3
7 ANA199804040 ANA01 BOS07 3
9 ANA199804050 ANA01 ANA01 0
674 BOS199804300 BOS07 BOS07 0
31 ANA199805010 BOS07 ANA01 -3
33 ANA199805020 PHO01 ANA01 0
35 ANA199805030 PHO01 PHO01 0
37 ANA199805040 PHO01 ATL 3
39 ANA199805050 PHO01 BAL12 3
Use DataFrame.join with a new MultiIndex Series created by DataFrame.stack:
df2 = df2.join(df1.stack().rename('new').rename_axis(('site_before','site')),
on=['site_before','site'])
print (df2)
Game_ID site_before site new
1 ANA199804010 ANA01 ANA01 0
3 ANA199804020 ANA01 ATL 3
5 ANA199804030 ANA01 BAL12 3
7 ANA199804040 ANA01 BOS07 3
9 ANA199804050 ANA01 ANA01 0
674 BOS199804300 BOS07 BOS07 0
31 ANA199805010 BOS07 ANA01 -3
33 ANA199805020 PHO01 ANA01 0
35 ANA199805030 PHO01 PHO01 0
37 ANA199805040 PHO01 ATL 3
39 ANA199805050 PHO01 BAL12 3
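For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same stack-and-join idea. It rebuilds df1 from the question and uses only four of the df2 rows; the names `sites`, `lookup` and `result` are just illustrative and not part of the original code.

```python
import pandas as pd

# df1 from the question: rows are 'site_before', columns are 'site'
sites = ['ANA01', 'PHO01', 'ATL', 'BAL12', 'BOS07']
df1 = pd.DataFrame(
    [[0, 0, 3, 3, 3],
     [0, 0, 3, 3, 3],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0]],
    index=sites, columns=sites)

# A four-row subset of df2, keeping the original index labels
df2 = pd.DataFrame({
    'Game_ID': ['ANA199804010', 'ANA199804020', 'BOS199804300', 'ANA199805010'],
    'site_before': ['ANA01', 'ANA01', 'BOS07', 'BOS07'],
    'site': ['ANA01', 'ATL', 'BOS07', 'ANA01'],
}, index=[1, 3, 674, 31])

# stack() turns the square table into a long Series keyed by (row, column);
# naming the two index levels lets join() match them against df2's columns
lookup = df1.stack().rename('new').rename_axis(['site_before', 'site'])

# join() keeps df2's index and adds one looked-up value per row
result = df2.join(lookup, on=['site_before', 'site'])
print(result)
```

Because join aligns on the two key columns and leaves df2's own index alone, the row labels (1, 3, 674, 31) survive, which is the main practical difference from the merge-based variant.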
An alternative is to use DataFrame.melt with DataFrame.merge and a left join:
df3 = (df1.rename_axis('site_before')
          .reset_index()
          .melt('site_before', var_name='site', value_name='new'))
df2 = df2.merge(df3, how='left')
print (df2)
Game_ID site_before site new
0 ANA199804010 ANA01 ANA01 0
1 ANA199804020 ANA01 ATL 3
2 ANA199804030 ANA01 BAL12 3
3 ANA199804040 ANA01 BOS07 3
4 ANA199804050 ANA01 ANA01 0
5 BOS199804300 BOS07 BOS07 0
6 ANA199805010 BOS07 ANA01 -3
7 ANA199805020 PHO01 ANA01 0
8 ANA199805030 PHO01 PHO01 0
9 ANA199805040 PHO01 ATL 3
10 ANA199805050 PHO01 BAL12 3
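As a side note on the original attempt: passing two lists of labels to .loc selects every requested row against every requested column, so the result is a full block rather than one value per row of df2. That is presumably why the assignment complained about the number of items. The sketch below shows this on a four-row subset of the question's data, and then a positional lookup through NumPy as a third option. This is a different technique from the two above, and it assumes the labels in df1 are unique and that every label in df2 exists in df1.

```python
import pandas as pd

sites = ['ANA01', 'PHO01', 'ATL', 'BAL12', 'BOS07']
df1 = pd.DataFrame(
    [[0, 0, 3, 3, 3],
     [0, 0, 3, 3, 3],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0],
     [-3, -3, 0, 0, 0]],
    index=sites, columns=sites)

df2 = pd.DataFrame({
    'Game_ID': ['ANA199804010', 'ANA199804020', 'BOS199804300', 'ANA199805010'],
    'site_before': ['ANA01', 'ANA01', 'BOS07', 'BOS07'],
    'site': ['ANA01', 'ATL', 'BOS07', 'ANA01'],
}, index=[1, 3, 674, 31])

# .loc with two label lists returns len(rows) x len(cols) values,
# not one value per (row, col) pair
block = df1.loc[df2['site_before'], df2['site']]
print(block.shape)

# Translate labels to integer positions, then index the array pairwise.
# Caution: get_indexer returns -1 for a missing label, which NumPy would
# silently read as "last row/column", so check for -1 on real data.
rows = df1.index.get_indexer(df2['site_before'])
cols = df1.columns.get_indexer(df2['site'])
df2['values'] = df1.to_numpy()[rows, cols]
print(df2)
```

On real data it is worth asserting that neither `rows` nor `cols` contains -1 before indexing. The join and merge approaches handle missing labels more gracefully by producing NaN.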