compare and delete columns on a dataframe

Question

Python:

I have a dataframe in which some genre columns are duplicated. I would like to merge the columns that refer to the same genre and, if any of them has the value "1", keep that value.

For example, 0genero_adventure has the value "0" and 1genero_adventure has the value "1", so I'd like to keep the "1".

This should apply not only to this example but to the whole table (which continues with more duplicated genre columns).

Thanks in advance :)

Answer 1

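I would store the genres in a list, loop through them, and set the merged column to 1 if either of the duplicated columns is 1, else 0.

import numpy as np

genres = ["action", "adventure"]  # ... continue with the remaining genres
for col in genres:
    # Use the element-wise | operator; a plain `or` between two Series raises a ValueError
    df[col] = np.where((df["0genero_" + col] == 1) | (df["1genero_" + col] == 1), 1, 0)

Then drop the remaining columns you don't need.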
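
To make the loop and the final drop step concrete, here is a minimal sketch on a toy dataframe. The column names follow the 0genero_/1genero_ pattern from the question; the values and the two-genre list are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values); names follow the pattern from the question
df = pd.DataFrame({
    "0genero_action": [0, 1, 0],
    "1genero_action": [0, 0, 0],
    "0genero_adventure": [0, 0, 1],
    "1genero_adventure": [1, 0, 1],
})

genres = ["action", "adventure"]
for col in genres:
    first = df["0genero_" + col]
    second = df["1genero_" + col]
    # Element-wise OR: 1 if either duplicate holds a 1, else 0
    df[col] = np.where((first == 1) | (second == 1), 1, 0)

# Drop the original duplicated columns once the merged ones exist
old_cols = [prefix + "genero_" + g for g in genres for prefix in ("0", "1")]
df = df.drop(columns=old_cols)
print(df)
```

This assumes every genre has exactly the two prefixed columns; a genre with only one of them would raise a KeyError.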

Answer 2

If I understood your problem correctly, the code below should work for you. The one requirement is that you create a list with the names of the genres.

genre_list = ["genero_Adventure", "genero_Biography", "genero_Comedy"]  # Add all the genre names like this

Then this loop should do the job:

for genre in genre_list:
    # Collect every column whose name contains the genre name (the match is case-sensitive)
    genre_cols_list = [col for col in df.columns if genre in col]

    # Row-wise max is 1 if any of the duplicated columns holds a 1
    merged = df[genre_cols_list].max(axis=1)
    # Delete the duplicated columns, then store the result under the plain genre name
    df.drop(columns=genre_cols_list, inplace=True)
    df[genre] = merged
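
If typing out the genre list by hand is impractical, the names could probably be derived from the columns themselves. The sketch below is only one possible variant, and it assumes that the duplicates differ solely by a leading digit prefix (as in 0genero_adventure and 1genero_adventure from the question). The "title" column and all values are hypothetical.

```python
import re
import pandas as pd

# Toy data (hypothetical); "title" stands in for any non-genre column
df = pd.DataFrame({
    "title": ["a", "b", "c"],
    "0genero_adventure": [0, 0, 1],
    "1genero_adventure": [1, 0, 0],
    "0genero_comedy": [0, 1, 0],
    "1genero_comedy": [0, 1, 0],
})

# Group the genre columns by their name without the leading digits
genre_cols = [c for c in df.columns if "genero_" in c]
groups = {}
for col in genre_cols:
    base = re.sub(r"^\d+", "", col)
    groups.setdefault(base, []).append(col)

# Start from the non-genre columns, then add one merged column per genre
merged = df.drop(columns=genre_cols)
for base, cols in groups.items():
    # Row-wise max keeps a 1 if any duplicate holds a 1
    merged[base] = df[cols].max(axis=1)

print(merged)
```

The row-wise max only gives the intended result when the genre columns hold 0/1 values.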

2 answers

solution1
0 2021-04-18 15:35:51

solution2
0 ACCEPTED 2021-04-18 15:54:07
