
How can I manipulate list elements inside columns of a pandas DataFrame?

I have a DataFrame:

      RR                    AA                SS            LL
C1    [C1, C2, C3, C4, C5]  [C1]              [C1]
C2    [C2, C3, C5]          [C1, C2, C3, C5]  [C5, C3, C2]  I
C3    [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
C4    [C4]                  [C1, C3, C4, C5]  [C4]          I
C5    [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]

I want to delete every row where LL is I, i.e. rows C2 and C4. I also need to delete the elements C2 and C4 from the lists in RR, AA and SS of the remaining rows, so that the output looks like this:

      RR            AA            SS        LL
C1    [C1, C3, C5]  [C1]          [C1]
C3    [C3, C5]      [C1, C3, C5]  [C5, C3]
C5    [C3, C5]      [C1, C3, C5]  [C5, C3]

I tried this code, but it only deletes the rows; it does not remove C2 and C4 from the lists in RR, AA and SS.

ix = df.RS.apply(set) == df.IS.apply(set)
df.loc[~ix]

The output I get still contains C2 and C4 in the lists in RR, AA and SS, which I don't want:

      RR                    AA                SS            LL
C1    [C1, C2, C3, C4, C5]  [C1]              [C1]
C3    [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
C5    [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]

This should do it:

new_df = df.loc[df['LL'] != 'I', ['RR', 'AA', 'SS']].applymap(set).apply(lambda col: col - {'C2', 'C4'}).applymap(list)

Output (the cells print as sets here, i.e. the state before the final .applymap(list) conversion):

>>> new_df
              RR            AA        SS
C1  {C5, C3, C1}          {C1}      {C1}
C3      {C5, C3}  {C1, C5, C3}  {C5, C3}
C5      {C5, C3}  {C1, C5, C3}  {C5, C3}
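
The one-liner above hardcodes {'C2', 'C4'} and goes through sets, which do not keep element order. A minimal sketch of a variant that derives the labels from the flagged rows and preserves list order is shown below. It assumes the row labels C1..C5 are the index of df, as the printed frame suggests; the names flagged and dropped are illustrative. It also avoids DataFrame.applymap, which is deprecated since pandas 2.1 in favour of DataFrame.map.

```python
import pandas as pd

# Rebuild the question's frame with the row labels as the index (assumed layout).
df = pd.DataFrame(
    {
        "RR": [["C1", "C2", "C3", "C4", "C5"], ["C2", "C3", "C5"],
               ["C2", "C3", "C4", "C5"], ["C4"], ["C2", "C3", "C4", "C5"]],
        "AA": [["C1"], ["C1", "C2", "C3", "C5"], ["C1", "C2", "C3", "C5"],
               ["C1", "C3", "C4", "C5"], ["C1", "C2", "C3", "C5"]],
        "SS": [["C1"], ["C5", "C3", "C2"], ["C5", "C3", "C2"], ["C4"],
               ["C5", "C3", "C2"]],
        "LL": ["", "I", "", "I", ""],
    },
    index=["C1", "C2", "C3", "C4", "C5"],
)

flagged = df["LL"] == "I"
dropped = set(df.index[flagged])  # labels of the flagged rows, not typed in by hand

new_df = df.loc[~flagged].copy()
for col in ["RR", "AA", "SS"]:
    # A list comprehension keeps the original element order, unlike a set.
    new_df[col] = new_df[col].apply(
        lambda items: [x for x in items if x not in dropped]
    )

print(new_df)
```

Because each cell is rebuilt as a new list, the lists held by the original df are left unchanged.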

Another approach builds the frame explicitly, with the row labels in a col1 column, and filters each list:

import pandas as pd

col1 = ['C1','C2','C3','C4','C5']
RR = [['C1', 'C2', 'C3', 'C4', 'C5'], ['C2', 'C3', 'C5'], ['C2', 'C3', 'C4', 'C5'], 
        ['C4'], ['C2', 'C3', 'C4', 'C5']]
AA = [['C1'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C3', 'C4', 'C5'], 
        ['C1', 'C2', 'C3', 'C5']]
SS = [['C1'], ['C5', 'C3', 'C2'], ['C5', 'C3', 'C2'], ['C4'], ['C5', 'C3', 'C2']]
LL = ['','I','','I','']

df1 = pd.DataFrame({'col1':col1, 'RR':RR,'AA':AA, 'SS':SS, 'LL':LL})

# Labels of the rows flagged with 'I' in LL: these rows are dropped and
# their labels are removed from every list in RR, AA and SS.
removing_row = df1.loc[df1['LL'] == 'I', 'col1']
removing_values = set(removing_row)

df1 = df1.drop(index=removing_row.index)

for col in ['RR','AA','SS']:
    # Rebuild each list without the removed labels (element order is preserved).
    df1[col] = df1[col].apply(lambda items: [x for x in items if x not in removing_values])

print(df1)
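
The same steps can be wrapped in a small function that returns a new frame instead of modifying the input. This is only a sketch: drop_flagged_rows and its parameters are hypothetical names, and the demo frame is made-up data rather than the question's.

```python
import pandas as pd

def drop_flagged_rows(df, label_col, flag_col, list_cols, flag="I"):
    """Drop rows where flag_col == flag and strip their labels from list_cols."""
    flagged = df[flag_col] == flag
    dropped = set(df.loc[flagged, label_col])
    out = df.loc[~flagged].copy()
    for col in list_cols:
        # Build new lists so the lists held by the input frame are not mutated.
        out[col] = out[col].apply(
            lambda items: [x for x in items if x not in dropped]
        )
    return out

# Small hypothetical frame, not the question's data.
demo = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "RR": [["A", "B", "C"], ["B"], ["B", "C"]],
    "LL": ["", "I", ""],
})
result = drop_flagged_rows(demo, "col1", "LL", ["RR"])
print(result)
```

For the frame in this answer, the call would be drop_flagged_rows(df1, 'col1', 'LL', ['RR', 'AA', 'SS']).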
