How to create a unique code based on a combination of two columns in a pandas DataFrame
I am trying to create a unique code based on the combination of two columns. Any help would be great. Thank you.
import pandas as pd
data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'], ['Alex', 'Ac'], ['Alex', 'Ay'], ['Alex', 'Ac'], ['Alex', 'Ac'], ['Bob', 'Ay'], ['Clarke', 'cv']]
# Both columns hold strings, so no dtype is forced (dtype=float cannot be applied to them)
df = pd.DataFrame(data, columns=['Name', 'Cat'])
Input:
     Name Cat
0    ajay  AL
1    ajay  AB
2    ajay  AL
3    Alex  Ac
4    Alex  Ay
5    Alex  Ac
6    Alex  Ac
7     Bob  Ay
8  Clarke  cv
Output:
     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  Bo_1
8  Clarke  cv  Cl_1
Thanks.
You can simply create a column with a UUID like this:
import uuid
df['code'] = df.apply(lambda x: uuid.uuid1(), axis=1)
Or, if you want to merge the two columns:
df['code'] = df.apply(lambda row: row.Name + row.Cat, axis=1)
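Note that uuid.uuid1() generates a fresh value on every call, so two rows with the same Name and Cat would receive different codes. If a UUID-style code that repeats for equal pairs is wanted, one possible sketch uses uuid.uuid5, which derives the UUID from a name. The namespace (uuid.NAMESPACE_OID) and the '|' separator are arbitrary choices made for this illustration, not something given in the question.

```python
import uuid
import pandas as pd

data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'],
        ['Alex', 'Ac'], ['Alex', 'Ay'], ['Alex', 'Ac'], ['Alex', 'Ac'],
        ['Bob', 'Ay'], ['Clarke', 'cv']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

# Assumption: any fixed namespace works; NAMESPACE_OID is an arbitrary pick.
NAMESPACE = uuid.NAMESPACE_OID

# Join the columns with a separator so ('ab', 'c') and ('a', 'bc') stay distinct.
key = df['Name'] + '|' + df['Cat']

# uuid5 hashes the key, so equal (Name, Cat) pairs map to the same UUID string.
df['code'] = key.map(lambda k: str(uuid.uuid5(NAMESPACE, k)))
print(df)
```

With this variant, rows 0 and 2 (both ajay / AL) share one code, while each distinct pair gets its own.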
I hope I understood your question correctly. This script creates a column code, where the first two characters come from the 'Name' column, followed by a number based on the 'Cat' column:
from itertools import count
from functools import lru_cache
from collections import defaultdict

# One independent counter per name, each starting at 1
d = defaultdict(lambda: count(1))

# Cache on (name, cat) so a repeated pair reuses the number it got the first time
@lru_cache(maxsize=None)
def get_num(n, c):
    return next(d[n])

df['code'] = df.apply(lambda x: '{}_{}'.format(x['Name'][:2].upper(), get_num(x['Name'], x['Cat'])), axis=1)
print(df)
Prints:
     Name Cat  code
0    ajay  AL  AJ_1
1    ajay  AB  AJ_2
2    ajay  AL  AJ_1
3    Alex  Ac  AL_1
4    Alex  Ay  AL_2
5    Alex  Ac  AL_1
6    Alex  Ac  AL_1
7     Bob  Ay  BO_1
8  Clarke  cv  CL_1
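The same numbering can also be sketched with pandas operations alone, without a cache or a row-wise apply. This is only one possible variant: it assumes the number should follow the order in which each 'Cat' value first appears within a 'Name', and the helper names pairs and num are made up for this example.

```python
import pandas as pd

data = [['ajay', 'AL'], ['ajay', 'AB'], ['ajay', 'AL'],
        ['Alex', 'Ac'], ['Alex', 'Ay'], ['Alex', 'Ac'], ['Alex', 'Ac'],
        ['Bob', 'Ay'], ['Clarke', 'cv']]
df = pd.DataFrame(data, columns=['Name', 'Cat'])

# Keep one row per distinct (Name, Cat) pair, in order of first appearance.
pairs = df[['Name', 'Cat']].drop_duplicates()

# Number the pairs within each Name: 1, 2, 3, ...
pairs['num'] = pairs.groupby('Name').cumcount() + 1

# Attach the number to every original row; a left merge keeps the row order of df.
df = df.merge(pairs, on=['Name', 'Cat'], how='left')

# Build the code from the upper-cased two-letter prefix and the number.
df['code'] = df['Name'].str[:2].str.upper() + '_' + df['num'].astype(str)
df = df.drop(columns='num')
print(df)
```

As in the script above, the counter is kept per full name while the prefix uses only the first two letters, so two different names that start with the same two letters could end up with the same code.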