
python - how to assign a special index to the data frame?

I have the following dataframe:

Col1    Col2      Col3   
X       Apple      A 
Y       Orange     B
Y       Apple      B
X       Apple      B
X       Orange     B

I want to create a 4-digit number to use in an index. The logic is that when a row's Col1 and Col2 match those of an earlier row, it gets the same 4-digit number as that row; otherwise it gets the next unused number. The Index is created by combining Number and Col3.

Expected output
Number  Col1    Col2      Col3   Index
0001    X       Apple      A     0001-A
0002    Y       Orange     B     0002-B 
0003    Y       Apple      B     0003-B
0001    X       Apple      B     0001-B
0004    X       Orange     B     0004-B

How can I achieve this?

First, build a dictionary that maps the concatenation of columns 1 and 2 to the number part of the index. That gives you the number part for every row, so you can then simply concatenate it with column 3.

Function to get the index number:

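def get_index_number(row, index_dict):
    # Key on the (Col1, Col2) pair; a pair seen before reuses its number
    unique_name = row['Col1'] + "-" + row['Col2']
    if unique_name not in index_dict:
        # First occurrence: assign the next sequential number, zero-padded to 4 digits
        index_dict[unique_name] = f"{len(index_dict) + 1:04d}"
    return index_dict[unique_name]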

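Usage: fill the 'Number' column row by row, then build 'Index' from 'Number' and 'Col3'

index_dict = {}
# iterrows() yields (label, row) pairs; collect the results into a new column
dataframe['Number'] = [get_index_number(row, index_dict) for _, row in dataframe.iterrows()]
dataframe['Index'] = dataframe['Number'] + "-" + dataframe['Col3']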
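
As an alternative to the row-by-row dictionary, the same numbering can probably be done with pandas group numbering. This is only a sketch, assuming the column names from the question (Col1, Col2, Col3) and that numbers should follow the order in which each pair first appears; the names df and group_id are made up for the example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["X", "Y", "Y", "X", "X"],
    "Col2": ["Apple", "Orange", "Apple", "Apple", "Orange"],
    "Col3": ["A", "B", "B", "B", "B"],
})

# sort=False numbers the (Col1, Col2) groups in order of first appearance;
# ngroup() is 0-based, so add 1 to start at 0001
group_id = df.groupby(["Col1", "Col2"], sort=False).ngroup() + 1

# Zero-pad to 4 digits, then append Col3 to form the Index column
df["Number"] = group_id.astype(str).str.zfill(4)
df["Index"] = df["Number"] + "-" + df["Col3"]

print(df[["Number", "Col1", "Col2", "Col3", "Index"]])
```

This avoids the Python-level loop, which may matter on large frames. If the real data has stray whitespace in Col3, stripping it first (for example with .str.strip()) would keep the Index values clean.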


 