I have this dataframe:
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
'fuz', 'baz', 'fuz', 'coo'],
'B' : ['one', 'one', 'two', 'two',
'three', 'three', 'four', 'one']})
It looks like this:
A B
0 foo one
1 bar one
2 foo two
3 bar two
4 fuz three
5 baz three
6 fuz four
7 coo one
I would like to create a new column, group. A group aggregates combinations of unique values in columns A and B.
It looks at the unique values in each column, then looks at the values in the other column for elements already in the group.
The result would look like this:
A B group
0 foo one 1
1 bar one 1
2 foo two 1
3 bar two 1
4 fuz three 2
5 baz three 2
6 fuz four 2
7 coo one 1
In this example, we start at foo in column A. All foo rows will be in group1. The associated values in B are one and two, so they are also in group1.
The values in column A associated with one and two are foo, bar and coo, so they are also in group1.
The same principle gives us group2.
What would be the best way to do it?
Could this be what you were looking for? It is a bit hard-coded, but it produces the desired output:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
'fuz', 'baz', 'fuz', 'coo'],
'B' : ['one', 'one', 'two', 'two',
'three', 'three', 'four', 'one']})
g1 = df[df['A']=='foo']
df['group'] = np.where(df['A'].isin(g1['A'])|df['B'].isin(g1['B']),1,2)
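The snippet above expands from foo only once: it picks up the B values of the foo rows, then every row sharing one of those values. A minimal sketch of repeating that step until the set of rows stops growing is shown below. The helper name expand_group is hypothetical, and the sketch still assumes that everything outside the first group belongs to group 2, so it remains hard-coded to two groups.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'fuz', 'baz', 'fuz', 'coo'],
                   'B': ['one', 'one', 'two', 'two',
                         'three', 'three', 'four', 'one']})

def expand_group(df, seed):
    """Boolean mask of the rows reachable from the rows where A == seed."""
    mask = df['A'] == seed
    while True:
        # Add every row that shares an A or a B value with the current rows
        grown = (df['A'].isin(df.loc[mask, 'A'])
                 | df['B'].isin(df.loc[mask, 'B']))
        if (grown == mask).all():  # nothing new was added, so stop
            return mask
        mask = grown

in_group1 = expand_group(df, 'foo')
df['group'] = np.where(in_group1, 1, 2)
print(df)
```

For the data in the question the mask stabilises after one extra pass and the group column matches the expected result.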
Adding to the answer posted by zipa: I think my code handles more cases. For example, the data in this df will be divided into 3 groups:
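import numpy as np
import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bae', 'foo', 'bar',
                          'fuz', 'baz', 'fzz', 'coo'],
                   'B' : ['one', 'one', 'two', 'two',
                          'three', 'three', 'four', 'one']})
df['group'] = [None]*len(df)  # no row has a group yet
i = 1
while True:
    # Take the A value of the first row that has no group yet
    value = df[df['group'].isnull()].iloc[0, 0]
    g1 = df[df['A']==value]
    # Rows sharing an A or B value with g1 get group i; the others keep their label
    df['group'] = np.where(df['A'].isin(g1['A'])|df['B'].isin(g1['B']), i, df['group'])
    if not df['group'].isnull().any():
        break
    i += 1
print(df)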
The result looks like this:
A B group
0 foo one 1
1 bae one 1
2 foo two 1
3 bar two 1
4 fuz three 2
5 baz three 2
6 fzz four 3
7 coo one 1
Hope this helps.
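The grouping rule in the question can also be read as finding connected components: each A value and each B value is a node, and every row links its A value to its B value. The loop above expands each seed by a single step, so longer chains of shared values may not end up in one group. Below is a sketch of the connected-components reading using a small union-find. This is a swapped-in technique, not the method from either answer, and the function name label_groups is hypothetical.

```python
import pandas as pd

def label_groups(df, col_a='A', col_b='B'):
    """Return one group number per row, numbered by first appearance."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    # Tag each value with its column so equal strings in A and B stay distinct
    for a, b in zip(df[col_a], df[col_b]):
        union((col_a, a), (col_b, b))

    ids = {}
    roots = [find((col_a, a)) for a in df[col_a]]
    # setdefault assigns the next number the first time a root is seen
    return [ids.setdefault(root, len(ids) + 1) for root in roots]

df = pd.DataFrame({'A': ['foo', 'bae', 'foo', 'bar',
                         'fuz', 'baz', 'fzz', 'coo'],
                   'B': ['one', 'one', 'two', 'two',
                         'three', 'three', 'four', 'one']})
df['group'] = label_groups(df)
print(df['group'].tolist())  # [1, 1, 1, 1, 2, 2, 3, 1]
```

On the data from the question (with fuz in rows 4 and 6) the same function gives groups 1, 1, 1, 1, 2, 2, 2, 1.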