Assigning a value to a new Pandas Column on given condition

Question

I'm new to Pandas.

I want to create a conditional column in Pandas. In R I could do this with mutate, but in Pandas DataFrame.assign() doesn't quite make sense to me.

What I want to do, in pseudocode, is:

DataFrame.MyKeyColumn = if (DataFrame.Condtional is NaN) then:
    concatenate[DataFrame.keyfield1, "_", DataFrame.keyfield2, "_", DataFrame.keyfield3, "_", keyfield4]
else:
    concatenate[DataFrame.keyfield1, "_", DataFrame.keyfield2, "_", DataFrame.condtionalfield, "_", DataFrame.keyfield3, "_", keyfield4]

In R you could do something like:

dplyr::mutate(Conditional = ifelse(is.na(mycondtion), paste(keyfield1, keyfield2), paste(keyfield1, condtionalfield, keyfield2)))

Example of my Current Data

Ideal End Goal

Any help would be really appreciated. I hope I'm just misunderstanding how DataFrame.assign() works, or that I need to nest a few functions like DataFrame.where().

Answer 1

You can use NumPy's where to choose, row by row, between two expressions based on a boolean condition, and assign the result to a new column. Here's an example based on your pseudocode:

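# Use bracket assignment: df.MyKeyColumn = ... would not create a new column.
# keyfield4 is assumed to be a column of df like the other key fields.
df["MyKeyColumn"] = np.where(df.Condtional.isna(),
    df.keyfield1+"_"+df.keyfield2+"_"+df.keyfield3+"_"+df.keyfield4,
    df.keyfield1+"_"+df.keyfield2+"_"+df.condtionalfield+"_"+df.keyfield3+"_"+df.keyfield4)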
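
To see that pattern run end to end, here is a minimal sketch on made-up data. The column names follow the question's pseudocode, but the sample values are invented. Two assumptions go beyond the original: every key field holds strings, and the column tested for NaN is the same one that gets concatenated (condtionalfield).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the question's pseudocode.
df = pd.DataFrame({
    "keyfield1": ["a", "b"],
    "keyfield2": ["c", "d"],
    "condtionalfield": [np.nan, "x"],
    "keyfield3": ["e", "f"],
    "keyfield4": ["g", "h"],
})

# Key without the conditional field.
short_key = (df["keyfield1"] + "_" + df["keyfield2"]
             + "_" + df["keyfield3"] + "_" + df["keyfield4"])

# Key with the conditional field in the middle (NaN where that field is missing).
long_key = (df["keyfield1"] + "_" + df["keyfield2"] + "_" + df["condtionalfield"]
            + "_" + df["keyfield3"] + "_" + df["keyfield4"])

# Take the short key where the conditional field is missing, the long key otherwise.
df["MyKeyColumn"] = np.where(df["condtionalfield"].isna(), short_key, long_key)

print(df["MyKeyColumn"].tolist())  # ['a_c_e_g', 'b_d_x_f_h']
```

If any of the key fields are numeric in your real data, you would likely need to convert them first, for example with .astype(str), before concatenating with "_".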

Here is a simplified example of usage:

import pandas as pd
import numpy as np

# Create a dummy dataframe
df = pd.DataFrame(data={"col1":[np.nan, 1, np.nan], "col2":[4, 5, 6]})

# Create a new column which fills in missing col1 values with data from col2
df["new_col"] = np.where(df["col1"].isna(), df["col2"], df["col1"])

# Create a new column which fills in missing col1 values with scalar value
df["new_col2"] = np.where(df["col1"].isna(), 7, df["col1"])
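
Since the question asks about DataFrame.assign() specifically: assign accepts a callable for each new column and returns a new DataFrame, so the same np.where expression can sit inside a lambda, much like a dplyr::mutate call. The sketch below reuses the dummy col1/col2 data from the simplified example. The names result and alt are only illustrative, and the second call shows Series.where as one possible pandas-only alternative.

```python
import numpy as np
import pandas as pd

# Same dummy dataframe as in the simplified example.
df = pd.DataFrame(data={"col1": [np.nan, 1, np.nan], "col2": [4, 5, 6]})

# assign() returns a new DataFrame and leaves df unchanged;
# the lambda receives the DataFrame that assign() is called on.
result = df.assign(
    new_col=lambda d: np.where(d["col1"].isna(), d["col2"], d["col1"])
)

# Series.where keeps values where the condition is True
# and takes them from the second argument elsewhere.
alt = df.assign(
    new_col=lambda d: d["col1"].where(d["col1"].notna(), d["col2"])
)

print(result["new_col"].tolist())  # [4.0, 1.0, 6.0]
```

Because assign does not modify df in place, you would write df = df.assign(...) if you want to keep working with the same name.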

solution1
0 ACCEPTED 2018-11-21 16:44:21
