Pandas: create a dataframe column based on another dataframe

Question

If I have two dataframes like these:

import pandas as pd

df1 = pd.DataFrame({'Type':list('AABAC')})
df2 = pd.DataFrame({'Type':list('ABCDEF'), 'Value':[1,2,3,4,5,6]})

  Type
0    A
1    A
2    B
3    A
4    C

  Type  Value
0    A      1
1    B      2
2    C      3
3    D      4
4    E      5
5    F      6

I would like to add a column to df1 based on the values in df2. df2 contains only unique Type values, whereas df1 can have multiple entries for each value. The resulting df1 should look like this:

  Type Value
0    A     1
1    A     1
2    B     2
3    A     1
4    C     3

My actual dataframe df1 is quite long, so I need something efficient (I tried a loop, but it takes forever).

Answer 1

You could create a dict from df2 with the to_dict method and then map it onto the Type column of df1:

replace_dict = dict(df2.to_dict('split')['data'])

In [50]: replace_dict
Out[50]: {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6}

df1['Value'] = df1['Type'].map(replace_dict)

In [52]: df1
Out[52]:
  Type  Value
0    A      1
1    A      1
2    B      2
3    A      1
4    C      3
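
Note that dict(df2.to_dict('split')['data']) only works when df2 has exactly two columns, since each row must be a (key, value) pair. A minimal sketch of an equivalent lookup that names the key and value columns explicitly, assuming the question's data plus a hypothetical extra column called Note that is added here only for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'Type': list('AABAC')})
# 'Note' is a hypothetical extra column, not part of the question's data
df2 = pd.DataFrame({'Type': list('ABCDEF'),
                    'Value': [1, 2, 3, 4, 5, 6],
                    'Note': list('uvwxyz')})

# Build the lookup from the two named columns only,
# so additional columns in df2 do not matter
replace_dict = dict(zip(df2['Type'], df2['Value']))

# Map each Type in df1 to its Value
df1['Value'] = df1['Type'].map(replace_dict)
print(df1)
```

The mapping step is unchanged; only the way the dict is built differs.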

Answer 2

As requested, I am posting a solution that uses map without the need to create a temporary dict:

In[3]:
df1['Value'] = df1['Type'].map(df2.set_index('Type')['Value'])
df1

Out[3]: 
  Type  Value
0    A      1
1    A      1
2    B      2
3    A      1
4    C      3

This relies on a couple of things. The key values being looked up should exist in df2, otherwise map fills those rows with NaN instead of a value. And df2 must not contain duplicate Type entries, otherwise the map call raises InvalidIndexError: Reindexing only valid with uniquely valued Index objects
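
A small sketch of how these two conditions can be handled, using hypothetical data in which 'Z' has no entry in the lookup and 'A' appears twice. As far as I know, in current pandas a key that is missing from the mapping Series produces NaN, and dropping duplicates before set_index is one way (an assumption about what you want, here "keep the first row") to make the index unique:

```python
import pandas as pd

# Hypothetical data: 'Z' is missing from df2, and 'A' appears twice in it
df1 = pd.DataFrame({'Type': list('ABZ')})
df2 = pd.DataFrame({'Type': list('AAB'), 'Value': [1, 10, 2]})

# Keep the first row per Type so the lookup index is unique
lookup = df2.drop_duplicates('Type').set_index('Type')['Value']

df1['Value'] = df1['Type'].map(lookup)

# 'Z' has no match, so its Value is NaN and the column dtype becomes float
print(df1)
print(df1['Value'].isna().sum(), "unmatched row(s)")
```

Checking isna() after the map is a cheap way to spot keys that were not found.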

Answer 3

Another way to do this is to use the label-based indexer loc. First make the Type column the index with .set_index, then select rows using the df1 column, and turn the index back into a regular column with .reset_index:

df2.set_index('Type').loc[df1['Type'],:].reset_index()

Either use this as your new df1 or extract the Value column:

df1['Value'] = df2.set_index('Type').loc[df1['Type'],:].reset_index()['Value']
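
The assignment above aligns on the index, so it relies on df1 having the default 0..n-1 index. A minimal sketch of a variant that assigns by position instead, assuming a df1 with a hypothetical non-default index chosen only to show the difference:

```python
import pandas as pd

# df1 with a hypothetical non-default index, to show why alignment matters
df1 = pd.DataFrame({'Type': list('AABAC')}, index=[10, 11, 12, 13, 14])
df2 = pd.DataFrame({'Type': list('ABCDEF'), 'Value': [1, 2, 3, 4, 5, 6]})

# Look up each Type by label; the result is indexed by Type, in df1's row order
values = df2.set_index('Type').loc[df1['Type'], 'Value']

# Assign by position (plain array) rather than by index alignment
df1['Value'] = values.to_numpy()
print(df1)
```

Unlike map, loc raises a KeyError when a Type in df1 is missing from df2, which may or may not be what you want.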

3 answers

Answer 1: score 2, accepted, 2016-08-05 09:17:08
Answer 2: score 2, 2017-10-23 09:28:49
Answer 3: score 0, 2017-10-31 10:11:00
