Extract value from list of values in one df column by comparing with other df column

The DataFrame contains two columns.

| Extraction                       | Actual    |
| -------------------------------- | --------- |
| [1_CHECK_90,2_SAVE_43,3_GO_56]   | 2_SAVE    |
| [1_FIN_54,2_CHECK_22]            | 1_FIN_54  |
| [1_L_32,2_Y_79,4_X_66]           | 2_Y_79    |
| [5_T_88]                         | NA        |

I want to derive the Actual value from the Extraction list by matching on the number at the left of each item in the Extraction column.

def extract_actual(row):
    try:
        a = []
        for i in row['Extraction']:
            for j in i:
                for k in j.split("_"):
                    for l in row['Actual']:
                        if k == l:
                            a.append(j)
        return a
    except Exception:
        # Any error falls back to an empty list, not None
        return []

I tried the function above. It works for most rows, but when Actual is 'NA' it does not return None.

Can you try this:

import pandas as pd

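df = pd.DataFrame({
    'Extraction': [['1_CHECK', '2_SAVE', '3_GO'], ['1_FIN', '2_CHECK'],
                   ['1_L', '2_Y', '4_X'], ['5_T']],
    'Actual': ['2_SAVE', '1_FIN', '2_Y', None],
})

# Compare every column with Actual row by row to get True/False values.
# There is no "Expected" column here, so difference() keeps both columns.
tFdf = df[df.columns.difference(["Expected"])].eq(df["Actual"], axis=0)

# tFdf["Actual"] is True wherever Actual holds a value (None compares as False).
# Assign the matched values with a single .loc call to avoid chained indexing.
df.loc[tFdf["Actual"], "Extraction"] = df.loc[tFdf["Actual"], "Actual"]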
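
If the goal is instead to pick, from each Extraction list, the item whose leading number matches the one in Actual, and to get None when Actual is missing, a row-wise helper may be easier to follow. This is only a sketch under assumptions: the helper name `pick_by_prefix` and the `Matched` column are made up for illustration, the leading number is taken to be the text before the first underscore, and a missing Actual is assumed to be None, NaN or the string 'NA'.

```python
import pandas as pd


def pick_by_prefix(items, actual):
    """Return the first item whose leading number matches that of `actual`.

    Hypothetical helper: returns None when `actual` is missing
    (None, NaN or the string 'NA') or when no item shares its prefix.
    """
    # Anything that is not a real string (None, NaN) counts as missing.
    if not isinstance(actual, str) or actual == "NA":
        return None
    prefix = actual.split("_")[0]  # text before the first underscore
    for item in items:
        if item.split("_")[0] == prefix:
            return item
    return None


# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Extraction": [
        ["1_CHECK_90", "2_SAVE_43", "3_GO_56"],
        ["1_FIN_54", "2_CHECK_22"],
        ["1_L_32", "2_Y_79", "4_X_66"],
        ["5_T_88"],
    ],
    "Actual": ["2_SAVE", "1_FIN_54", "2_Y_79", "NA"],
})

# Walk both columns together and store the match in a new column.
df["Matched"] = [
    pick_by_prefix(items, actual)
    for items, actual in zip(df["Extraction"], df["Actual"])
]
```

Keeping the result in a separate column leaves the original Extraction lists untouched, so the match can be checked against them afterwards.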
