Manipulating a column in pandas dataframe

Question

I have a pandas dataframe as below:

import numpy as np
import pandas as pd

data = {'A' : [1,2,3], 
        'B' : [2,17,17], 
        'C1' : ["C1", np.nan,np.nan],
        'C2' : ["C2", "C2",np.nan]} 

# Create DataFrame 
df = pd.DataFrame(data)

Dataframe:

    A   B   C1  C2
0   1   2   C1  C2
1   2   17  NaN C2
2   3   17  NaN NaN

I am creating a column "C" with the logic and code below.

If any of the C columns (C1, C2, C3, ...) has a value in a row, "C" holds those values joined with commas.

df['C'] = df.filter(regex=r'C\d+').stack().groupby(level=0).agg(','.join)

Result:

    A   B   C1  C2  C
0   1   2   C1  C2  C1,C2
1   2   17  NaN C2  C2
2   3   17  NaN NaN NaN
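
For readers who want to see what the one-liner does, here is a step-by-step sketch of the same chain. The intermediate names (c_cols, stacked, joined) are only illustrative, and the explicit dropna() is an assumption added so the result does not depend on whether stack() drops missing values in the installed pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [2, 17, 17],
    'C1': ["C1", np.nan, np.nan],
    'C2': ["C2", "C2", np.nan],
})

# Keep only the columns whose name contains "C" followed by digits.
c_cols = df.filter(regex=r'C\d+')

# Long format: one entry per (row label, column name).
# dropna() is a defensive addition (assumption) to discard empty cells.
stacked = c_cols.stack().dropna()

# Group by the original row label and join what is left with commas.
joined = stacked.groupby(level=0).agg(','.join)

# Assignment aligns on the index, so rows with no values at all get NaN.
df['C'] = joined
print(df)
```

Note that filter(regex=...) uses a search rather than a full match, so a stricter pattern such as r'^C\d+$' may be safer if other column names could also contain "C" followed by digits.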

Now I want to apply the following logic.

If "C" holds more than one value (say C1 and C2) for a row, duplicate that row so that each value gets its own row. I want my output to look like this:

    A   B   C1  C2  C
0   1   2   C1  C2  C1
0   1   2   C1  C2  C2
1   2   17  NaN C2  C2
2   3   17  NaN NaN NaN

Answer 1

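We can do it with explode, then concat:

# collect each row's values into a list, explode to one value per row, join back to df
s=df.filter(regex=r'C\d+').stack().groupby(level=0).agg(list).explode().to_frame('C').join(df)
# append the rows of df that had no value at all and renumber the index
s=pd.concat([s,df[~df.index.isin(s.index)]],axis=0,join='outer',ignore_index=True,sort=False)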
s
Out[62]: 
     C  A   B   C1   C2
0   C1  1   2   C1   C2
1   C2  1   2   C1   C2
2   C2  2  17  NaN   C2
3  NaN  3  17  NaN  NaN
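
As a possible simplification of the approach above (a sketch, not the answerer's code): if the join starts from df instead of from the exploded values, the rows with no value survive the left join and the concat step is not needed. The names vals and out are illustrative, and the dropna() call is an assumption added so the sketch does not rely on stack() discarding missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [2, 17, 17],
    'C1': ["C1", np.nan, np.nan],
    'C2': ["C2", "C2", np.nan],
})

# One entry per non-null cell, indexed by (row label, column name).
vals = df.filter(regex=r'C\d+').stack().dropna()

# Drop the column-name level so only the original row label remains,
# and name the Series so it becomes column "C" after the join.
vals = vals.droplevel(1).rename('C')

# Left join on the index: a row label that appears twice in vals yields
# two output rows, and a label missing from vals gets NaN in "C".
out = df.join(vals).reset_index(drop=True)
print(out)
```

The column order differs from the output above ("C" ends up last), but the rows are the same.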

Answer 2

You could melt the C columns into long form and merge the result back:

df.merge(df.melt(['A','B'],value_name= 'C').dropna().drop('variable',axis = 1),how = "left")
   A   B   C1   C2    C
0  1   2   C1   C2   C1
1  1   2   C1   C2   C2
2  2  17  NaN   C2   C2
3  3  17  NaN  NaN  NaN
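
The one-liner above packs several steps together, so here is a sketch of the same idea split into stages. The variable names long_df and out are illustrative. One caveat that holds for this approach in general: the merge keys are A and B, so it assumes each (A, B) pair identifies a single row; if two rows shared the same A and B, their values would be cross-matched.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [2, 17, 17],
    'C1': ["C1", np.nan, np.nan],
    'C2': ["C2", "C2", np.nan],
})

# Wide to long: one row per (A, B, source column), value stored in "C".
long_df = df.melt(id_vars=['A', 'B'], value_name='C')

# Discard empty cells, then the column recording where each value came from.
long_df = long_df.dropna(subset=['C']).drop(columns='variable')

# Left merge on A and B: rows of df without any value are kept with NaN in "C".
out = df.merge(long_df, on=['A', 'B'], how='left')
print(out)
```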

Answer 3

You can just use df.explode(...). Try:

# Note: the values are aggregated into a list, not a string
df['C'] = df.filter(regex=r'C\d+').stack().groupby(level=0).agg(list)

# one output row per list element
df=df.explode("C")

Outputs:

   A   B   C1   C2    C
0  1   2   C1   C2   C1
0  1   2   C1   C2   C2
1  2  17  NaN   C2   C2
2  3  17  NaN  NaN  NaN
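
As the output shows, explode keeps the original row label, so label 0 appears twice. If a fresh 0..n-1 index is wanted, a reset_index(drop=True) can be chained on, as in this sketch. The dropna() after stack() is an assumption added so the example does not depend on whether the installed pandas version drops missing values while stacking.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [2, 17, 17],
    'C1': ["C1", np.nan, np.nan],
    'C2': ["C2", "C2", np.nan],
})

# Collect the non-null C1, C2, ... values of each row into a list.
# Rows with no values are absent from the result and become NaN on assignment.
df['C'] = df.filter(regex=r'C\d+').stack().dropna().groupby(level=0).agg(list)

# One output row per list element; NaN entries pass through unchanged.
# reset_index(drop=True) replaces the repeated labels with 0..n-1.
out = df.explode('C').reset_index(drop=True)
print(out)
```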

3 answers

solution1
0 ACCEPTED 2020-02-10 21:41:29

solution2
0 2020-02-10 21:52:09

solution3
0 2020-02-10 22:27:18