Appending and Removing Repeated Values on a column of a Dataframe Pandas

Question

I have a dataframe that I built with df4.append(df3, ignore_index=True). However, I am having trouble removing the repeats in my Gene_Symbol column while still keeping the values in Case1, Case2 and Case3. I have already tried df4.drop_duplicates(["Gene_Symbol"]) and various other methods, all of which delete the other rows and my data along with them.

What I am getting is this:

         X       Case1       Case2       Case3       Gene_Symbol 
8026    8025    0.5326718   0.0000000   0.0000000   GAPDHS;TMEM147
32531   32530   0.0000000   0.5416982   0.0000000   GAPDHS;TMEM147
57051   57050   0.0000000   0.0000000   0.4821592   GAPDHS;TMEM147

What I would like to have is the dataframe below, where my actual values are kept:

     Case1       Case2       Case3       Gene_Symbol 
    0.5326718   0.5416982   0.4821592   GAPDHS;TMEM147

Thank you for your time!

Answer 1

You could try the following. It should work if each Case column contains only one non-zero value per gene (this assumes you have dropped the X column, which looks like an index):

df.set_index('Gene_Symbol').stack()[lambda x: x != 0].unstack(level=1).reset_index()

#      Gene_Symbol     Case1       Case2       Case3
#0  GAPDHS;TMEM147  0.532672    0.541698    0.482159

Or, if the X column is still present, drop it first:

df
#          X       Case1       Case2       Case3       Gene_Symbol
#8026   8025    0.532672    0.000000    0.000000    GAPDHS;TMEM147
#32531  32530   0.000000    0.541698    0.000000    GAPDHS;TMEM147
#57051  57050   0.000000    0.000000    0.482159    GAPDHS;TMEM147

df.drop(columns='X', inplace=True)

df.set_index('Gene_Symbol').stack()[lambda x: x != 0].unstack(level=1).reset_index()

#      Gene_Symbol     Case1       Case2       Case3
#0  GAPDHS;TMEM147  0.532672    0.541698    0.482159
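
For anyone who wants to run the reshape end to end, here is a minimal sketch using the three rows from the question. It assumes the X column has already been dropped and that each gene has at most one non-zero value per Case column. If a gene had two non-zero values in the same column, unstack would fail because the (gene, case) pairs would no longer be unique. The names long_form, nonzero and result are only for illustration.

```python
import pandas as pd

# Rows copied from the question, without the X column.
df = pd.DataFrame(
    {
        "Case1": [0.5326718, 0.0, 0.0],
        "Case2": [0.0, 0.5416982, 0.0],
        "Case3": [0.0, 0.0, 0.4821592],
        "Gene_Symbol": ["GAPDHS;TMEM147"] * 3,
    },
    index=[8026, 32531, 57051],
)

# Long format: one value per (Gene_Symbol, case column) pair.
long_form = df.set_index("Gene_Symbol").stack()

# Keep only the non-zero entries, so each pair appears at most once.
nonzero = long_form[long_form != 0]

# Back to wide format: one row per gene, one column per case.
result = nonzero.unstack(level=1).reset_index()
print(result)
```

The explicit mask long_form != 0 does the same job as the lambda in the one-liner above; it is split out here only to make each step visible.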

Answer 2

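How about summing the Case columns within each gene? Because the other rows hold zeros, the sum keeps the single non-zero value: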

df = df.groupby('Gene_Symbol')[['Case1', 'Case2', 'Case3']].sum().reset_index()

    Gene_Symbol     Case1       Case2       Case3
0   GAPDHS;TMEM147  0.532672    0.541698    0.482159
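
A self-contained sketch of the same idea, in case it helps. The three part frames are hypothetical stand-ins for the pieces that were appended in the question. They are combined with pd.concat because DataFrame.append was removed in pandas 2.0. Summing is only safe if each gene has at most one non-zero value per Case column, so the sketch checks that assumption first.

```python
import pandas as pd

cases = ["Case1", "Case2", "Case3"]
gene = "GAPDHS;TMEM147"

# Hypothetical pieces; the values are copied from the question.
part1 = pd.DataFrame({"Case1": [0.5326718], "Case2": [0.0], "Case3": [0.0], "Gene_Symbol": [gene]})
part2 = pd.DataFrame({"Case1": [0.0], "Case2": [0.5416982], "Case3": [0.0], "Gene_Symbol": [gene]})
part3 = pd.DataFrame({"Case1": [0.0], "Case2": [0.0], "Case3": [0.4821592], "Gene_Symbol": [gene]})

# pd.concat takes the place of the removed DataFrame.append.
df = pd.concat([part1, part2, part3], ignore_index=True)

# Count non-zero entries per gene and case; more than one would make the sum misleading.
nonzero_counts = (df[cases] != 0).groupby(df["Gene_Symbol"]).sum()
if (nonzero_counts > 1).any().any():
    raise ValueError("Some gene has several non-zero values in one Case column")

# One row per gene; each Case column collapses to its single non-zero value.
result = df.groupby("Gene_Symbol", as_index=False)[cases].sum()
print(result)
```

If the check fails for your real data, you would need to decide how to combine the competing values (for example max or mean) rather than sum.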

2 answers

solution1
0 2017-05-17 21:43:00

solution2
0 ACCEPTED 2017-05-17 22:15:51
