简体   繁体   中英

Conditions based on multiple column values

I have a dataframe that looks like this.

数据框

I want to add a column 'Correct_entry' that returns 'True' or 'False' based on the combination of values in the current columns. In total I have 24 possible combinations that are correct. I give two examples:

1.

If df['Hazard_type'] == 'Drought' & df['Card_type'] == 'Red block' & If['Round'] == 1 & df['Scenario'] == 'scenario A' & df['Payment_type'] == 'One payment (lump sum)' 
If df['Hazard_type'] == 'Drought' & df['Card_type'] == 'Green block' & If['Round'] == 1 & df['Scenario'] == 'scenario A' & df['Payment_type'] == 'One payment (lump sum)' 

I have 24 different combinations of values that are all correct and should equal 'True'. All other combinations should equal to 'False'.

What would be the best way to go through this set of data? How can I combine all these different statements?

I hope it is clear.

edit: As requested, the data in text format.

    Hazard_type  Card_type   Round  Scenario    Payment_type
244 Drought      Green block    2   scenario B  Two payments (two consecutive sums)
643 Drought      Red block      4   scenario A  Two payments (two consecutive sums)
584 Drought      Red block      4   scenario A  One payment (lump sum)
242 Drought      Red block      2   scenario B  Two payments (two consecutive sums)
1039 Drought     Green block    6   scenario A  Two payments (two consecutive sums)
101 Flood        Red block      1   scenario A  Two payments (two consecutive sums)
  • generate data to match your DF
  • use np.select() to return the matched condition or np.nan for transparency
  • then simple to get to bool
s = 200
df = pd.DataFrame({"Hazard_type":np.random.choice(["Drought","Flood"],s),
             "Card_type":np.random.choice(["Red block","Green block"],s),
             "Round":np.random.randint(1,7,s),
             "Scenario":np.random.choice(["scenario A","scenario B"],s),
             "Payment_type":np.random.choice(["One payment (lump sum)","Two payments"],s)})

conditions = [
    # condition 0
((df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Red block') & (df['Round'] == 1) 
 & (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')
),    
    # condition 1
((df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Green block') & (df['Round'] == 1) & 
 (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')
)]

df = df.assign(Correct_case=np.select(conditions, [c for c in range(len(conditions))], np.nan),
          Correct_entry=lambda dfa: ~dfa.Correct_case.isna())

sample output of Correct_entry==True

Hazard_type Card_type Round Scenario Payment_type Correct_case Correct_entry
47 Drought Red block 1 scenario A One payment (lump sum) 0 True
50 Drought Red block 1 scenario A One payment (lump sum) 0 True
72 Drought Red block 1 scenario A One payment (lump sum) 0 True
104 Drought Red block 1 scenario A One payment (lump sum) 0 True
120 Drought Red block 1 scenario A One payment (lump sum) 0 True
170 Drought Red block 1 scenario A One payment (lump sum) 0 True
186 Drought Green block 1 scenario A One payment (lump sum) 1 True

I would approach this by creating a series of boolean masks depending on the scenerio

# boolean masks
drought_mask = df['Hazard_type'] == 'Drought'
red_block_mask = df['Card_type'] == 'Red block'
green_block_mask = df['Card_type'] == 'Green block'
round_1_mask = df['Round'] == 1
scenario_a_mask = df['Scenario'] == 'scenario A'
one_payment_mask = df['Payment_type'] == 'One payment (lump sum)'


# scenerio 1
scenario_1_mask = df.loc[(drought_mask & red_block_mask & scenario_a_mask) & (round_1_mask & one_payment_mask & one_payment_mask)]
# scenario 2
scenario_2_mask = df.loc[(drought_mask & green_block_mask & scenario_a_mask) & (round_1_mask & one_payment_mask & one_payment_mask)]

# combining scenerios
df['Correct_entry'] = df.loc[scenario_1_mask | scenario_2_mask]

You can create a mask for every scenario and then combine them in the final correct_entry column using an OR operator (|). This should return a True value for every row which is matches one of the 24 scenarios.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM