How to create a unique code based on the combination of two columns in a Python DataFrame

Question

I am trying to create a unique code based on the combination of two columns. Any help would be great. Thank you.

import pandas as pd
data = [['ajay','AL'],['ajay','AB'],['ajay','AL'],['Alex','Ac'],['Alex','Ay'],['Alex','Ac'],['Alex','Ac'],['Bob','Ay'],['Clarke','cv']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

Input:

    Name Cat
0    ajay  AL
1    ajay  AB
2    ajay  AL
3    Alex  Ac
4    Alex  Ay
5    Alex  Ac
6    Alex  Ac
7     Bob  Ay
8  Clarke  cv

Output:

     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  Bo_1
8  Clarke  cv  Cl_1

Thanks.

Answer 1

You can simply create a column with a UUID like this (each row gets its own value, so repeated Name/Cat pairs will not share a code):

import uuid
df['code'] = df.apply(lambda x: uuid.uuid1(), axis=1)

Or, if you want to merge the two columns:

df['code'] = df.apply(lambda row: row.Name + row.Cat, axis=1)
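
As a rough sketch of the same merge without apply, the two string columns can be concatenated directly. The shortened sample data and the '_' separator are assumptions added here for illustration, not part of the answer above.

```python
import pandas as pd

# Shortened sample of the question's data (assumption: the first four rows only)
data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'], ['Alex', 'Ac']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

# Element-wise string concatenation of the two columns.
# The '_' separator is an assumption; it keeps the two parts visually distinct.
df['code'] = df['Name'] + '_' + df['Cat']
print(df)
```

Rows with the same Name and Cat end up with the same code, since the code is built only from those two values.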

Answer 2

I hope I understood your question correctly. This script creates a column code, where the first two characters come from the 'Name' column (upper-cased), followed by a number based on the 'Cat' column:

from itertools import count
from functools import lru_cache
from collections import defaultdict

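# One counter per Name, starting at 1
d = defaultdict(lambda: count(1))

# Cached on (Name, Cat): the counter advances only the first time a pair is seen
@lru_cache(maxsize=None)
def get_num(n, c):
    return next(d[n])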

df['code'] = df.apply(lambda x: '{}_{}'.format(x['Name'][:2].upper(), get_num(x['Name'], x['Cat'])), axis=1)
print(df)

Prints:

     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  BO_1
8  Clarke  cv  CL_1
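
For comparison, here is a minimal sketch that produces the same numbering with pandas operations only, instead of the cached counter. It is one possible alternative rather than the answer's method; the helper names combos, merged and num are hypothetical.

```python
import pandas as pd

data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'], ['Alex', 'Ac'],
        ['Alex', 'Ay'], ['Alex', 'Ac'], ['Alex', 'Ac'], ['Bob', 'Ay'],
        ['Clarke', 'cv']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

# Distinct (Name, Cat) pairs, kept in order of first appearance
combos = df[['Name', 'Cat']].drop_duplicates()

# Number the pairs within each Name: 1, 2, 3, ...
combos = combos.assign(num=combos.groupby('Name').cumcount() + 1)

# Bring the per-pair number back onto every row (a left merge keeps row order)
merged = df.merge(combos, on=['Name', 'Cat'], how='left')

# First two characters of Name, upper-cased, plus the number
df['code'] = merged['Name'].str[:2].str.upper() + '_' + merged['num'].astype(str)
print(df)
```

This assumes df has a default 0..n-1 index, as in the question, so that the merged rows line up with the original ones.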
