I have two dataframes, the first as pairwise connections among values:
import pandas as pd

df1 = pd.DataFrame({'n1': [5,1,1,3,4,3,2,2],
                    'n2': [1,6,3,4,3,2,3,7]})
n1 n2
0 5 1
1 1 6
2 1 3
3 3 4
4 4 3
5 3 2
6 2 3
7 2 7
and the second showing which values belong to each group g:
df2 = pd.DataFrame({'n': [1,5,6,2,3,4,7,7],
'g': ['a','a','a','b','b','b','c','c']})
g n
0 a 1
1 a 5
2 a 6
3 b 2
4 b 3
5 b 4
6 c 7
7 c 7
I'm trying to map the dataframes in order to get:
n1 n2 g1 g2
0 5 1 a a
1 1 6 a a
2 1 3 a b
3 3 4 b b
4 4 3 b b
5 3 2 b b
6 2 3 b b
7 2 7 b c
So for each of n1 and n2, I want a new column holding the group in df2 that the value belongs to.
So far I tried mapping with:
df1['g1'] = df1['n1'].map(df2['g'])
df1['g2'] = df1['n2'].map(df2['g'])
But actually this returns:
n1 n2 g1 g2
0 5 1 b a
1 1 6 a c
2 1 3 a b
3 3 4 b b
4 4 3 b b
5 3 2 b a
6 2 3 a b
7 2 7 a c
because it is mapping on df2.index instead of the n to g pairs. Setting the index of df2 to g:
df2.index = df2['g']
leads to the following error:
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Drop the duplicates in df2, and you can then call map with n set as the index:
In [58]:
df2 = df2.drop_duplicates()
df2
Out[58]:
g n
0 a 1
1 a 5
2 a 6
3 b 2
4 b 3
5 b 4
6 c 7
In [61]:
df1[['g1','g2']] = df1.apply(lambda x: x.map(df2.set_index('n')['g']))
df1
Out[61]:
n1 n2 g1 g2
0 5 1 a a
1 1 6 a a
2 1 3 a b
3 3 4 b b
4 4 3 b b
5 3 2 b b
6 2 3 b b
7 2 7 b c
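For reference, here is the same idea as a standalone script, as a minimal sketch rather than the only way to do it. It builds the n to g lookup once and maps each column explicitly instead of using apply, so it still works if df1 already has other columns (for example g1 and g2 left over from an earlier attempt). The name lookup is just an illustrative variable, not something from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'n1': [5, 1, 1, 3, 4, 3, 2, 2],
                    'n2': [1, 6, 3, 4, 3, 2, 3, 7]})
df2 = pd.DataFrame({'n': [1, 5, 6, 2, 3, 4, 7, 7],
                    'g': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c']})

# Build the lookup once: index is n, values are g.
# drop_duplicates() removes the repeated (7, 'c') row so the index is unique.
lookup = df2.drop_duplicates().set_index('n')['g']

# Series.map with a Series argument looks each value up in that Series' index,
# which is why mapping against df2['g'] directly used the 0..7 row labels.
df1['g1'] = df1['n1'].map(lookup)
df1['g2'] = df1['n2'].map(lookup)

print(df1)
```

Note that drop_duplicates() only removes rows that are identical in every column. If the same n could appear with two different groups, the index would still be non-unique and map would raise the same error; in that case something like df2.drop_duplicates(subset='n') would be needed, at the cost of keeping only the first group for that value.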