Pandas lookup/pivot using column headings
I have a table containing watershed IDs and land cover classes:
WatershedID LandCover
2 Corn
8 Corn
2 Soy
8 Soy
and a separate lookup table which contains the area for each watershed/land cover combination:
WatershedID Corn Soy
2 14 1
3 2 14
5 18 8
7 21 2
8 6 31
What I would like to do is append a column to the first table containing the corresponding row/column value from the lookup table, like so:
WatershedID LandCover Area
2 Corn 14
8 Corn 6
2 Soy 1
8 Soy 31
I've managed to do this by iterating with a for loop:
areas = []
# iterrows() yields (index, row) pairs; tableB is assumed to be indexed by WatershedID
for _, row in tableA.iterrows():
    areas.append(tableB.loc[row['WatershedID'], row['LandCover']])
but given the size of my tables, this is slow. Is there a faster way to do this that doesn't involve iteration? I've been experimenting with MultiIndexing and pivot tables, but nothing has worked so far.
You can use unstack with merge:
df3 = df2.set_index('WatershedID').unstack().reset_index()
df3.columns = ['LandCover','WatershedID','Area']
print (df3)
LandCover WatershedID Area
0 Corn 2 14
1 Corn 3 2
2 Corn 5 18
3 Corn 7 21
4 Corn 8 6
5 Soy 2 1
6 Soy 3 14
7 Soy 5 8
8 Soy 7 2
9 Soy 8 31
print (pd.merge(df1,df3))
WatershedID LandCover Area
0 2 Corn 14
1 8 Corn 6
2 2 Soy 1
3 8 Soy 31
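For reference, the steps above can be put together as a self-contained sketch. The frames df1 and df2 are rebuilt here from the sample data in the question, and how='left' is an added assumption (not in the original answer) so that every row of df1 is kept in its original order even if a combination were missing from the lookup table:

```python
import pandas as pd

# Sample data from the question: df1 holds the (watershed, land cover) pairs,
# df2 is the wide lookup table with one column per land cover class.
df1 = pd.DataFrame({
    "WatershedID": [2, 8, 2, 8],
    "LandCover": ["Corn", "Corn", "Soy", "Soy"],
})
df2 = pd.DataFrame({
    "WatershedID": [2, 3, 5, 7, 8],
    "Corn": [14, 2, 18, 21, 6],
    "Soy": [1, 14, 8, 2, 31],
})

# Reshape the lookup table from wide to long: one row per
# (LandCover, WatershedID) combination with its area.
df3 = df2.set_index("WatershedID").unstack().reset_index()
df3.columns = ["LandCover", "WatershedID", "Area"]

# Join on both keys; how="left" (an assumption) keeps all rows of df1.
result = pd.merge(df1, df3, on=["WatershedID", "LandCover"], how="left")
print(result)
```

With a left join, a pair that has no entry in the lookup table would get NaN in Area instead of being dropped.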
If the two DataFrames have other column names in common, you need to specify the columns to join on:
print (pd.merge(df1,df3, on=['WatershedID','LandCover']))
WatershedID LandCover Area
0 2 Corn 14
1 8 Corn 6
2 2 Soy 1
3 8 Soy 31
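As an alternative not mentioned in the answer, the same wide-to-long reshape can probably be done with melt, which names the new columns in a single call. This is only a sketch under the assumption that the lookup table has WatershedID as a regular column; long_areas and merged are illustrative names:

```python
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({
    "WatershedID": [2, 8, 2, 8],
    "LandCover": ["Corn", "Corn", "Soy", "Soy"],
})
df2 = pd.DataFrame({
    "WatershedID": [2, 3, 5, 7, 8],
    "Corn": [14, 2, 18, 21, 6],
    "Soy": [1, 14, 8, 2, 31],
})

# melt turns each land cover column into rows, keeping WatershedID as the id.
long_areas = df2.melt(id_vars="WatershedID", var_name="LandCover", value_name="Area")

# Join on both keys explicitly, as in the merge above.
merged = df1.merge(long_areas, on=["WatershedID", "LandCover"], how="left")
print(merged)
```

Both approaches avoid the Python-level loop; which one reads better is largely a matter of taste.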