
pandas map dataframes columns

I have two dataframes, the first as pairwise connections among values:

import pandas as pd

df1 = pd.DataFrame({'n1': [5,1,1,3,4,3,2,2],
                    'n2': [1,6,3,4,3,2,3,7]})

   n1  n2
0   5   1
1   1   6
2   1   3
3   3   4
4   4   3
5   3   2
6   2   3
7   2   7

and the second records which group g each value belongs to:

df2 = pd.DataFrame({'n': [1,5,6,2,3,4,7,7],
                    'g': ['a','a','a','b','b','b','c','c']})

   g  n
0  a  1
1  a  5
2  a  6
3  b  2
4  b  3
5  b  4
6  c  7
7  c  7 

I'm trying to map the dataframes in order to get:

   n1  n2  g1  g2
0   5   1   a   a   
1   1   6   a   a
2   1   3   a   b
3   3   4   b   b
4   4   3   b   b
5   3   2   b   b
6   2   3   b   b
7   2   7   b   c

So for each of n1 and n2, I want a new column holding the group from df2 that the value belongs to.

So far I tried mapping with:

df1['g1'] = df1['n1'].map(df2['g'])
df1['g2'] = df1['n2'].map(df2['g'])

But actually this returns:

   n1  n2 g1 g2
0   5   1  b  a
1   1   6  a  c
2   1   3  a  b
3   3   4  b  b
4   4   3  b  b
5   3   2  b  a
6   2   3  a  b
7   2   7  a  c

because it maps on df2.index rather than on the n-to-g pairs. Setting the index of df2 to g:

df2.index = df2['g']

leads to the following error:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
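
To make the index-based lookup concrete, here is a minimal sketch using the same data as above. The variable name by_position is only illustrative. When Series.map is given a Series, each value is looked up in that Series' index, and df2['g'] still carries the default row positions 0 to 7 as its index.

```python
import pandas as pd

df1 = pd.DataFrame({'n1': [5, 1, 1, 3, 4, 3, 2, 2],
                    'n2': [1, 6, 3, 4, 3, 2, 3, 7]})
df2 = pd.DataFrame({'n': [1, 5, 6, 2, 3, 4, 7, 7],
                    'g': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c']})

# df2['g'] is indexed by row position, not by the values in column n.
positions = list(df2['g'].index)

# Each n1 value is therefore treated as a row position in df2:
# n1 == 5 picks df2['g'][5] == 'b', even though 5 belongs to group 'a'.
by_position = df1['n1'].map(df2['g'])
print(by_position.tolist())
```

So the value 5 is read as "row 5 of df2", which is why the first row ends up with g1 equal to b.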

Drop the duplicates in df2, and you can then call map with a lookup Series indexed by n:

In [58]:
df2 = df2.drop_duplicates()
df2

Out[58]:
   g  n
0  a  1
1  a  5
2  a  6
3  b  2
4  b  3
5  b  4
6  c  7

In [61]:
# map only n1 and n2, so this also works if df1 already has other columns
df1[['g1','g2']] = df1[['n1','n2']].apply(lambda x: x.map(df2.set_index('n')['g']))
df1

Out[61]:
   n1  n2 g1 g2
0   5   1  a  a
1   1   6  a  a
2   1   3  a  b
3   3   4  b  b
4   4   3  b  b
5   3   2  b  b
6   2   3  b  b
7   2   7  b  c
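
As a self-contained variant of the same idea, the sketch below builds the lookup once and maps each column explicitly. Two things here are assumptions rather than part of the answer above: the names lookup and result are illustrative, and drop_duplicates is called with subset='n', which keeps the first group seen for each n. That matters only if some n were listed under more than one group, a case the plain drop_duplicates() would leave non-unique.

```python
import pandas as pd

df1 = pd.DataFrame({'n1': [5, 1, 1, 3, 4, 3, 2, 2],
                    'n2': [1, 6, 3, 4, 3, 2, 3, 7]})
df2 = pd.DataFrame({'n': [1, 5, 6, 2, 3, 4, 7, 7],
                    'g': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c']})

# Build a Series indexed by n whose values are the groups.
# subset='n' guarantees a unique index (assumption: first group wins).
lookup = df2.drop_duplicates(subset='n').set_index('n')['g']

# assign returns a new frame, leaving df1 untouched.
result = df1.assign(g1=df1['n1'].map(lookup),
                    g2=df1['n2'].map(lookup))
print(result)
```

Values of n1 or n2 that do not appear in df2 would come back as NaN rather than raising an error.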


 