How to combine two pandas dataframes with different length based on their row values

Question

I have the following two pandas dataframes:

dataframe #1:

    user_id       animals    
0         1         'dog'
1         1         'cat'
2         1         'cow'
3         2         'dog'
4         2         'cat'
5         2         'cow'
...

dataframe #2: (column_D is not important in this task)

    location     column_D
0       'CA'            1
1       'MA'            1
2       'AZ'            1
3       'CT'            1
...

I hope to create a new dataframe #3 based on #1 and #2:

dataframe #3:

    user_id       animals       location
0         1         'dog'           'MA'
1         1         'cat'           'MA'
2         1         'cow'           'MA'
3         2         'dog'           'AZ'
4         2         'cat'           'AZ'
5         2         'cow'           'AZ'
...

The first and second columns of dataframe #3 are identical to dataframe #1. For the third column, I hope to assign a location based on its user_id and the index in dataframe #2. For example, for row 0 in dataframe #3, since its user_id = 1, I will check the location in dataframe #2 with index = 1, then assign that location ('MA' in this example) to the user.

I've searched for examples that use functions such as concat, map, and merge, but couldn't find any that match this case. Is there a way to achieve this?

Thank you so much!

Answer 1

Try map. When given a Series, map looks up each user_id value in the index of df2.location:

df1["location"] = df1.user_id.map(df2.location)

    user_id animals location
0      1    'dog'   'MA'
1      1    'cat'   'MA'
2      1    'cow'   'MA'
3      2    'dog'   'AZ'
4      2    'cat'   'AZ'
5      2    'cow'   'AZ'
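
For reference, here is a self-contained sketch of the same idea. The frames are rebuilt from the sample rows in the question; the extra row with user_id 9 is a hypothetical value, added only to show that a user_id with no matching index in df2 ends up as NaN rather than raising an error.

```python
import pandas as pd

# Sample data rebuilt from the question; user_id 9 is hypothetical
df1 = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2, 9],
    "animals": ["dog", "cat", "cow", "dog", "cat", "cow", "dog"],
})
df2 = pd.DataFrame({
    "location": ["CA", "MA", "AZ", "CT"],
    "column_D": [1, 1, 1, 1],
})

# Work on a copy so df1 itself is left unchanged
df3 = df1.copy()

# Each user_id is looked up in the index of df2["location"];
# values with no matching index label become NaN
df3["location"] = df3["user_id"].map(df2["location"])

print(df3)
```

If missing matches should be treated as an error instead, checking df3["location"].isna().any() after the map is one simple option.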

Answer 2

Based on your question, I'm assuming you want to drop column_D after the join. The following code joins df1's user_id column against the index of df2.

# Import pandas
import pandas as pd

# Create dataframes: df1 and df2
df1 = pd.DataFrame({'user_id': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2}, 'animals': {0: 'dog', 1: 'cat', 2: 'cow', 3: 'dog', 4: 'cat', 5: 'cow'}})
df2 = pd.DataFrame({'location': {0: 'CA', 1: 'MA', 2: 'AZ', 3: 'CT'}, 'column_D': {0: 1, 1: 1, 2: 1, 3: 1}})

# Match each 'user_id' in df1 against the index of df2, then drop column_D
# (columns= is used because the positional axis argument was removed in pandas 2.0)
df3 = df1.join(df2, on='user_id').drop(columns='column_D')

# Display the new dataframe
df3
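
As a possible alternative, the same lookup can be sketched with merge. This assumes, as in the question, that user_id refers to df2's default integer index; selecting only the location column before merging avoids having to drop column_D afterwards.

```python
import pandas as pd

# Sample data rebuilt from the question
df1 = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2],
    "animals": ["dog", "cat", "cow", "dog", "cat", "cow"],
})
df2 = pd.DataFrame({
    "location": ["CA", "MA", "AZ", "CT"],
    "column_D": [1, 1, 1, 1],
})

# Left merge: keep every row of df1, matching its user_id column
# against the index of df2 (only the location column is carried over)
df3 = df1.merge(
    df2[["location"]],
    how="left",
    left_on="user_id",
    right_index=True,
)

print(df3)
```

A left merge keeps every row of df1 even when a user_id has no counterpart in df2, which mirrors the behaviour of map and join here.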

solution1
0 ACCEPTED 2020-09-14 00:34:09

solution2
0 2020-09-14 01:57:34
