In Pandas I have a dataframe in which several columns define a configuration. I want to identify the rows that share the same configuration

Question

import pandas as pd

df = pd.DataFrame({'id': [101, 102, 103, 104, 105, 106, 107],
                   'color': ['blue', 'blue', 'blue', 'red', 'blue', 'red', 'blue'],
                   'location': ['there', 'here', 'there', 'here', 'here', 'there', 'here']})

df

Out[12]:

    id color location
0  101  blue    there
1  102  blue     here
2  103  blue    there
3  104   red     here
4  105  blue     here
5  106   red    there
6  107  blue     here

I want to create a column that labels each combination of color and location, like this:

    id color location group
0  101  blue    there     A
1  102  blue     here     B
2  103  blue    there     A
3  104   red     here     C
4  105  blue     here     B
5  106   red    there     D
6  107  blue     here     B

Answer 1

This looks like a job for groupby().ngroup(). Passing sort=False numbers the groups in order of first appearance:

df['group'] = df.groupby(['color','location'], sort=False).ngroup()

Output:

    id color location  group
0  101  blue    there      0
1  102  blue     here      1
2  103  blue    there      0
3  104   red     here      2
4  105  blue     here      1
5  106   red    there      3
6  107  blue     here      1
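
The question asked for letter labels (A, B, C, D) rather than integers. A minimal sketch of one way to get them from the ngroup() codes follows; the lookup through string.ascii_uppercase is my own addition, not part of the original answer, and it assumes there are at most 26 distinct configurations.

```python
import string

import pandas as pd

df = pd.DataFrame({
    'id': [101, 102, 103, 104, 105, 106, 107],
    'color': ['blue', 'blue', 'blue', 'red', 'blue', 'red', 'blue'],
    'location': ['there', 'here', 'there', 'here', 'here', 'there', 'here'],
})

# Integer group ids, numbered in order of first appearance (sort=False).
codes = df.groupby(['color', 'location'], sort=False).ngroup()

# Map 0, 1, 2, ... to 'A', 'B', 'C', ...
# Assumption: at most 26 distinct configurations; a 27th would raise IndexError.
df['group'] = codes.map(lambda i: string.ascii_uppercase[i])

print(df)
```

With more than 26 groups you would need a different labelling scheme, so keeping the integer codes is probably the more robust choice.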

Answer 2

I would use factorize:

df[['color','location']].agg(','.join, axis=1).factorize()[0]
Out[12]: array([0, 1, 0, 2, 1, 3, 1], dtype=int64)
# To store the result as a new column:
# df['group'] = df[['color','location']].agg(','.join, axis=1).factorize()[0]
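
Joining with ',' only works when every key column holds strings, and two different configurations could collide if a value itself contains a comma. A hedged variation on the same factorize idea, which is my own sketch and not part of the original answer, builds a tuple per row instead of a joined string; the names key_cols and keys are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [101, 102, 103, 104, 105, 106, 107],
    'color': ['blue', 'blue', 'blue', 'red', 'blue', 'red', 'blue'],
    'location': ['there', 'here', 'there', 'here', 'here', 'there', 'here'],
})

# Columns that together define a configuration (illustrative name).
key_cols = ['color', 'location']

# One tuple per row, e.g. ('blue', 'there'); tuples are hashable,
# so factorize can assign one integer code per distinct tuple.
keys = df[key_cols].apply(tuple, axis=1)

# factorize numbers the distinct keys in order of first appearance.
df['group'] = pd.factorize(keys)[0]

print(df)
```

On this data it should give the same codes as the string-join version; the row-wise apply is likely slower on large frames, so treat it as a sketch rather than a tuned solution.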

Answer 1: 2, accepted, 2020-07-23 16:04:27

Answer 2: 0, 2020-07-23 16:05:26
