How to create a unique code based on the combination of two columns in a pandas DataFrame

I am trying to create a unique code based on the combination of two columns. Any help would be great. Thank you.

import pandas as pd

data = [['ajay','AL'],['ajay','AB'],['ajay','AL'],['Alex','Ac'],['Alex','Ay'],['Alex','Ac'],['Alex','Ac'],['Bob','Ay'],['Clarke','cv']]
# Both columns hold strings, so no dtype is specified
df = pd.DataFrame(data, columns=['Name','Cat'])

Input:

    Name Cat
0    ajay  AL
1    ajay  AB
2    ajay  AL
3    Alex  Ac
4    Alex  Ay
5    Alex  Ac
6    Alex  Ac
7     Bob  Ay
8  Clarke  cv

Expected output:

     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  Bo_1
8  Clarke  cv  Cl_1

Thanks.

You can simply create a column with a UUID like this (each row gets its own value, so repeated Name/Cat pairs will not share a code):

import uuid
df['code'] = df.apply(lambda x: uuid.uuid1(), axis=1)

Or, if you want to concatenate the two columns:

df['code'] = df.apply(lambda row: row.Name + row.Cat, axis=1)
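The same concatenation can also be written without `apply`, since pandas adds string columns element-wise. This is a minimal sketch on a three-row subset of the question's data. The `'_'` separator is an assumption added here for readability and is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL']],
    columns=['Name', 'Cat'],
)

# Element-wise string concatenation on whole columns, no apply() needed.
# The '_' separator is an assumption, added only to keep the parts readable.
df['code'] = df['Name'] + '_' + df['Cat']
print(df)
```

Rows with the same Name/Cat pair get the same code, but the result is the full joined text, not the short numbered form the question asks for.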

If I understood your question correctly, this script creates a column code whose first two characters come from the 'Name' column (uppercased), followed by a number assigned to each distinct 'Cat' value within that name:

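from itertools import count
from functools import lru_cache
from collections import defaultdict

# One independent counter per name, each starting at 1
d = defaultdict(lambda: count(1))

# The cache returns the same number for a repeated (name, cat) pair,
# so a name's counter only advances the first time a pair is seen
@lru_cache(maxsize=None)
def get_num(n, c):
    return next(d[n])

df['code'] = df.apply(lambda x: '{}_{}'.format(x['Name'][:2].upper(), get_num(x['Name'], x['Cat'])), axis=1)
print(df)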

Prints:

     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  BO_1
8  Clarke  cv  CL_1
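For larger frames, the same numbering can probably be done without a row-wise `apply` or a module-level cache. The sketch below is an alternative to the script above, not a rewrite of it. It numbers each distinct Name/Cat pair in order of first appearance within its name, then merges that number back onto the rows. The helper names `pairs` and `num` are made up for this illustration.

```python
import pandas as pd

data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'], ['Alex', 'Ac'],
        ['Alex', 'Ay'], ['Alex', 'Ac'], ['Alex', 'Ac'], ['Bob', 'Ay'],
        ['Clarke', 'cv']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

# Keep one row per distinct (Name, Cat) pair, in order of first appearance.
# 'pairs' and 'num' are hypothetical helper names for this sketch.
pairs = df[['Name', 'Cat']].drop_duplicates()

# cumcount() numbers rows within each Name group from 0, so add 1.
pairs['num'] = pairs.groupby('Name').cumcount() + 1

# A left merge keeps the original row order and attaches each pair's number.
df = df.merge(pairs, on=['Name', 'Cat'], how='left')

# First two characters of Name, uppercased, then '_' and the number.
df['code'] = df['Name'].str[:2].str.upper() + '_' + df['num'].astype(str)
df = df.drop(columns='num')
print(df)
```

Note that `merge` returns a frame with a fresh integer index, so a frame with a custom index would need it restored afterwards.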
