I have the following dataframe:
Col1 Col2 Col3
X Apple A
Y Orange B
Y Apple B
X Apple B
X Orange B
I want to create a 4-digit number to use in an Index. The logic is that when a row's Col1 and Col2 match those of an earlier row, it gets the same 4-digit number as that row; otherwise it gets the next unused number. The Index is created by combining Number and Col3.
Expected output:
Number Col1 Col2 Col3 Index
0001 X Apple A 0001-A
0002 Y Orange B 0002-B
0003 Y Apple B 0003-B
0001 X Apple B 0001-B
0004 X Orange B 0004-B
How can I achieve this?
First, build a dictionary that maps the concatenation of Col1 and Col2 to the number part of the index. That gives you the number part for every row, so you can then simply concatenate it with Col3.
Function to get the index number:
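    def get_index_number(row, index_dict):
        # Key on the Col1/Col2 pair; a pair not seen before gets the next 4-digit number.
        unique_name = row['Col1'] + "-" + row['Col2']
        if unique_name not in index_dict:
            index_dict[unique_name] = f"{len(index_dict) + 1:04d}"
        return index_dict[unique_name]
Usage (this adds the 'Number' and 'Index' columns to the dataframe):
    index_dict = {}
    # iterrows() yields (label, row) pairs, so unpack and keep only the row.
    numbers = [get_index_number(row, index_dict) for _, row in dataframe.iterrows()]
    dataframe['Number'] = numbers
    dataframe['Index'] = dataframe['Number'] + "-" + dataframe['Col3']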
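As an alternative to the row-by-row loop, here is a vectorized sketch of the same idea. It assumes the dataframe from the question; the names df and group_no are only illustrative. It relies on groupby(..., sort=False).ngroup(), which numbers the groups in order of first appearance, so treat it as one possible approach rather than the only way.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": ["X", "Y", "Y", "X", "X"],
    "Col2": ["Apple", "Orange", "Apple", "Apple", "Orange"],
    "Col3": ["A", "B", "B", "B", "B"],
})

# Number each (Col1, Col2) pair in order of first appearance, starting at 1.
group_no = df.groupby(["Col1", "Col2"], sort=False).ngroup() + 1

# Zero-pad to 4 digits, then append Col3 to form the Index.
df["Number"] = group_no.astype(str).str.zfill(4)
df["Index"] = df["Number"] + "-" + df["Col3"]

print(df[["Number", "Col1", "Col2", "Col3", "Index"]])
```

This avoids iterating in Python, which tends to matter once the dataframe gets large.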