Pandas lookup/pivot using column headers

Question

I have a table containing watershed IDs and land cover classes:

WatershedID LandCover
          2      Corn
          8      Corn
          2       Soy
          8       Soy

and a separate lookup table which contains the area for each watershed/land cover combination:

WatershedID  Corn  Soy
          2    14    1
          3     2   14
          5    18    8
          7    21    2
          8     6   31

What I would like to do is append a column to the first table containing the corresponding row/column value from the lookup table, like so:

WatershedID LandCover   Area
          2      Corn     14
          8      Corn      6
          2       Soy      1
          8       Soy     31

I've managed to do this by iterating with a for loop:

areas = []
for _, row in tableA.iterrows():
    # tableB is assumed to be indexed by WatershedID
    areas.append(tableB.loc[row['WatershedID'], row['LandCover']])

but given the size of my tables, this is slow. Is there a faster way to do this that doesn't involve iteration? I've been experimenting with MultiIndexing and pivot tables, but nothing has worked so far.

Answer 1

You can use unstack with merge (here df1 is the first table and df2 is the lookup table):

df3 = df2.set_index('WatershedID').unstack().reset_index()
df3.columns = ['LandCover','WatershedID','Area']
print (df3)
  LandCover  WatershedID  Area
0      Corn            2    14
1      Corn            3     2
2      Corn            5    18
3      Corn            7    21
4      Corn            8     6
5       Soy            2     1
6       Soy            3    14
7       Soy            5     8
8       Soy            7     2
9       Soy            8    31

print (pd.merge(df1,df3))
   WatershedID LandCover  Area
0            2      Corn    14
1            8      Corn     6
2            2       Soy     1
3            8       Soy    31
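For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the two sample tables from the question and swaps the unstack step for `DataFrame.melt`, which should produce the same long-format table with the column names set in one call. The name `long_areas` and the choice of `how="left"` (to keep every row of df1 in its original order) are assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "WatershedID": [2, 8, 2, 8],
    "LandCover": ["Corn", "Corn", "Soy", "Soy"],
})
df2 = pd.DataFrame({
    "WatershedID": [2, 3, 5, 7, 8],
    "Corn": [14, 2, 18, 21, 6],
    "Soy": [1, 14, 8, 2, 31],
})

# Reshape the wide lookup table into one row per (WatershedID, LandCover) pair
long_areas = df2.melt(id_vars="WatershedID", var_name="LandCover", value_name="Area")

# Left merge keeps every row of df1, in df1's order
result = df1.merge(long_areas, on=["WatershedID", "LandCover"], how="left")
print(result)
```

With a left merge, any pair missing from the lookup table would come back with NaN in Area rather than being dropped.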

If the two frames share other column names as well, you need to specify the join columns explicitly:

print (pd.merge(df1,df3, on=['WatershedID','LandCover']))
   WatershedID LandCover  Area
0            2      Corn    14
1            8      Corn     6
2            2       Soy     1
3            8       Soy    31
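If the merge itself turns out to be a bottleneck, a different technique that might be worth trying is positional indexing into the underlying NumPy array. The sketch below is not the method from the answer above: it uses `Index.get_indexer` to translate labels into row and column positions and then does one fancy-indexing lookup. It assumes WatershedID values are unique in the lookup table and that every pair in df1 exists there; the variable names `lookup`, `rows`, and `cols` are illustrative.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "WatershedID": [2, 8, 2, 8],
    "LandCover": ["Corn", "Corn", "Soy", "Soy"],
})
df2 = pd.DataFrame({
    "WatershedID": [2, 3, 5, 7, 8],
    "Corn": [14, 2, 18, 21, 6],
    "Soy": [1, 14, 8, 2, 31],
})

# Index the lookup table by watershed so labels map to row positions
lookup = df2.set_index("WatershedID")

# Translate labels to integer positions; -1 marks a label that was not found
rows = lookup.index.get_indexer(df1["WatershedID"])
cols = lookup.columns.get_indexer(df1["LandCover"])

# Assumption of this sketch: every (WatershedID, LandCover) pair is present
if (rows < 0).any() or (cols < 0).any():
    raise KeyError("some watershed/land cover pairs are missing from the lookup table")

# One vectorized lookup: element i is lookup.iloc[rows[i], cols[i]]
df1["Area"] = lookup.to_numpy()[rows, cols]
print(df1)
```

Whether this is actually faster than the merge depends on the table sizes, so it is worth timing both on real data.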

Answer 1 (2, accepted, 2016-11-02 14:01:34)
