python - how to assign a special index to the data frame?
I have the following dataframe:
Col1 Col2 Col3
X Apple A
Y Orange B
Y Apple B
X Apple B
X Orange B
I want to create a 4-digit number to use in an index. The logic is that when a row's Col1 and Col2 match those of an earlier row, it gets the same 4-digit number as that earlier row. The Index is then created by combining Number and Col3.
Expected output
Number Col1 Col2 Col3 Index
0001 X Apple A 0001-A
0002 Y Orange B 0002-B
0003 Y Apple B 0003-B
0001 X Apple B 0001-B
0004 X Orange B 0004-B
How can I achieve this?
First build a dictionary for the number part of the index, keyed by the concatenation of Col1 and Col2. That gives you the number part for every row, so you then simply concatenate that number with Col3.
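Function to get the index number:
def get_index_number(row, index_dict):
    # Key each row by its Col1/Col2 combination
    unique_name = row['Col1'] + "-" + row['Col2']
    if unique_name not in index_dict:
        # New combination: assign the next 4-digit number
        index_dict[unique_name] = f"{len(index_dict) + 1:04d}"
    return index_dict[unique_name]
Usage: this adds the 'Number' and 'Index' columns to the dataframe.
index_dict = {}
# apply(axis=1) passes each row to the function; rows from iterrows() are copies, so assigning to them would not change the dataframe
dataframe['Number'] = dataframe.apply(get_index_number, axis=1, args=(index_dict,))
dataframe['Index'] = dataframe['Number'] + "-" + dataframe['Col3']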
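As an alternative to the row-by-row dictionary, the same numbering can probably be done without an explicit loop. The following is only a sketch, assuming the sample data from the question and the column names Col1, Col2 and Col3; the variable names df, key and codes are made up for illustration. It uses pandas.factorize, which labels each distinct value with an integer in order of first appearance, and that is the same rule the dictionary implements.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["X", "Y", "Y", "X", "X"],
    "Col2": ["Apple", "Orange", "Apple", "Apple", "Orange"],
    "Col3": ["A", "B", "B", "B", "B"],
})

# factorize() labels each distinct key 0, 1, 2, ... in order of first appearance
key = df["Col1"] + "-" + df["Col2"]
codes, _ = pd.factorize(key)

# Shift the labels to start at 1 and zero-pad them to 4 digits
df["Number"] = pd.Series(codes + 1, index=df.index).astype(str).str.zfill(4)

# The final index is the number joined with Col3
df["Index"] = df["Number"] + "-" + df["Col3"]

print(df[["Number", "Col1", "Col2", "Col3", "Index"]])
```

Note that joining the two columns with "-" assumes the values themselves never contain "-"; if they can, a separator that cannot occur in the data would be safer.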