How to combine two pandas dataframes of different lengths based on row values

Question

I have the following two pandas dataframes:

dataframe #1:

    user_id       animals    
0         1         'dog'
1         1         'cat'
2         1         'cow'
3         2         'dog'
4         2         'cat'
5         2         'cow'
...

dataframe #2: (column_D is not important in this task)

    location     column_D
0       'CA'            1
1       'MA'            1
2       'AZ'            1
3       'CT'            1
...

I hope to create a new dataframe #3 based on #1 and #2:

dataframe #3:

    user_id       animals       location
0         1         'dog'           'MA'
1         1         'cat'           'MA'
2         1         'cow'           'MA'
3         2         'dog'           'AZ'
4         2         'cat'           'AZ'
5         2         'cow'           'AZ'
...

The first and second columns of dataframe #3 are identical to dataframe #1. For the third column, I hope to assign a location based on the row's user_id and the index of dataframe #2. For example, for row 0 in dataframe #3, since its user_id = 1, I will look up the location in dataframe #2 at index = 1, then assign that location ('MA' in this example) to the user.

I've searched for examples that use functions such as concat, map, and merge, but couldn't find any that match this case. Is there a way to achieve this?

Thank you so much!

Answer 1

Try map:

df1["location"] = df1.user_id.map(df2.location)

    user_id animals location
0      1    'dog'   'MA'
1      1    'cat'   'MA'
2      1    'cow'   'MA'
3      2    'dog'   'AZ'
4      2    'cat'   'AZ'
5      2    'cow'   'AZ'
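For readers who want to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same map call. The frame contents are copied from the question; writing the result to a copy named df3 and the extra lookup of user_id 9 are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question
df1 = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2],
    "animals": ["dog", "cat", "cow", "dog", "cat", "cow"],
})
df2 = pd.DataFrame({
    "location": ["CA", "MA", "AZ", "CT"],
    "column_D": [1, 1, 1, 1],
})

# Passing a Series to map looks each user_id up in that Series' index,
# so user_id 1 picks df2.location[1] and user_id 2 picks df2.location[2].
df3 = df1.copy()  # work on a copy so df1 stays unchanged (an assumption)
df3["location"] = df3["user_id"].map(df2["location"])
print(df3)

# A user_id with no matching index label in df2 maps to NaN
# (9 is a made-up id used only to show this).
unmatched = pd.Series([1, 9]).map(df2["location"])
print(unmatched)
```

Because map works on a single column, column_D never enters the result and nothing has to be dropped afterwards.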

Answer 2

Based on your question, I've assumed that you want to drop column_D after the join. The following code works.

# Import pandas
import pandas as pd

# Create dataframes df1 and df2
df1 = pd.DataFrame({'user_id': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2}, 'animals': {0: 'dog', 1: 'cat', 2: 'cow', 3: 'dog', 4: 'cat', 5: 'cow'}})
df2 = pd.DataFrame({'location': {0: 'CA', 1: 'MA', 2: 'AZ', 3: 'CT'}, 'column_D': {0: 1, 1: 1, 2: 1, 3: 1}})

# Match each 'user_id' value in df1 against the index of df2, then drop column_D
# (the keyword form drop(columns=...) is used because newer pandas versions reject a positional axis argument)
df3 = df1.join(df2, on='user_id').drop(columns='column_D')

# Display the new dataframe
df3
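The same lookup can also be written with merge, which the question mentions. This is only a sketch under the assumption that the sample data looks like the frames in the question; the name merged is hypothetical. Selecting just the location column before merging means column_D never has to be dropped.

```python
import pandas as pd

# Sample data reconstructed from the question
df1 = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2],
    "animals": ["dog", "cat", "cow", "dog", "cat", "cow"],
})
df2 = pd.DataFrame({
    "location": ["CA", "MA", "AZ", "CT"],
    "column_D": [1, 1, 1, 1],
})

# Match df1's user_id column against df2's index.
# how="left" keeps every row of df1, even if a user_id has no match in df2.
merged = df1.merge(
    df2[["location"]],   # keep only the column we need
    left_on="user_id",
    right_index=True,
    how="left",
)
print(merged)
```

Compared with join followed by drop, this keeps the unwanted column out of the result from the start, which may be easier to read when df2 has many columns.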

Answer 1
0 accepted 2020-09-14 00:34:09

Answer 2
0 2020-09-14 01:57:34
