How can I manipulate list names inside columns in a pandas DataFrame?

I have a DataFrame:

               RR                    AA                  SS         LL
 C1     [C1, C2, C3, C4, C5]        [C1]                [C1]    
 C2     [C2, C3, C5]            [C1, C2, C3, C5]    [C5, C3, C2]    I
 C3     [C2, C3, C4, C5]        [C1, C2, C3, C5]    [C5, C3, C2]    
 C4           [C4]              [C1, C3, C4, C5]        [C4]        I
 C5     [C2, C3, C4, C5]        [C1, C2, C3, C5]    [C5, C3, C2]    
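For anyone who wants to reproduce the table, here is a minimal sketch of how it could be built. It assumes the blank LL cells are empty strings and that C1 to C5 are the index labels; the variable name `labels` is only illustrative.

```python
import pandas as pd

# Row labels; the same strings also appear inside the list cells.
labels = ["C1", "C2", "C3", "C4", "C5"]

df = pd.DataFrame(
    {
        "RR": [
            ["C1", "C2", "C3", "C4", "C5"],
            ["C2", "C3", "C5"],
            ["C2", "C3", "C4", "C5"],
            ["C4"],
            ["C2", "C3", "C4", "C5"],
        ],
        "AA": [
            ["C1"],
            ["C1", "C2", "C3", "C5"],
            ["C1", "C2", "C3", "C5"],
            ["C1", "C3", "C4", "C5"],
            ["C1", "C2", "C3", "C5"],
        ],
        "SS": [
            ["C1"],
            ["C5", "C3", "C2"],
            ["C5", "C3", "C2"],
            ["C4"],
            ["C5", "C3", "C2"],
        ],
        # Assumption: an empty string stands for a blank LL cell.
        "LL": ["", "I", "", "I", ""],
    },
    index=labels,
)

print(df)
```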

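I want to delete every row that has I in LL, i.e. rows C2 and C4, and also remove the elements C2 and C4 from the lists in RR, AA and SS of the remaining rows, so that the output looks like this: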

            RR               AA            SS         LL
 C1     [C1, C3, C5]        [C1]          [C1]  
 C3     [C3, C5]        [C1, C3, C5]    [C5, C3]    
 C5     [C3, C5]        [C1, C3, C5]    [C5, C3]    

I tried this code, but it only removes rows; it does not remove the elements C2 and C4 from the lists in RR, AA and SS.

ix = df.RS.apply(set) == df.IS.apply(set)
df.loc[~ix]

I get output like this, where C2 and C4 still appear in the lists in RR, AA and SS, which I don't want.

               RR                    AA                  SS         LL
 C1     [C1, C2, C3, C4, C5]        [C1]                [C1]    
 C3     [C2, C3, C4, C5]        [C1, C2, C3, C5]    [C5, C3, C2]    
 C5     [C2, C3, C4, C5]        [C1, C2, C3, C5]    [C5, C3, C2]    
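A small sketch of why a row filter alone gives this result: boolean indexing selects whole rows and never looks inside the list cells. The frame below is trimmed to RR and LL to keep it short, blank LL cells are assumed to be empty strings, and `kept` and `still_present` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "RR": [
            ["C1", "C2", "C3", "C4", "C5"],
            ["C2", "C3", "C5"],
            ["C2", "C3", "C4", "C5"],
            ["C4"],
            ["C2", "C3", "C4", "C5"],
        ],
        "LL": ["", "I", "", "I", ""],
    },
    index=["C1", "C2", "C3", "C4", "C5"],
)

# Keep only the rows that are not flagged with 'I'.
kept = df.loc[df["LL"] != "I"]
print(kept)

# The surviving cells are the original lists, so C2 and C4 are still inside them.
still_present = kept["RR"].map(lambda lst: "C2" in lst or "C4" in lst)
print(still_present)
```

The lists therefore need a second, element-wise step in addition to the row filter.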

This should do it:

new_df = df.loc[df['LL'] != 'I', ['RR', 'AA', 'SS']].applymap(set).apply(lambda col: col - {'C2', 'C4'}).applymap(list)

Output:

>>> new_df
              RR            AA        SS
C1  {C5, C3, C1}          {C1}      {C1}
C3      {C5, C3}  {C1, C5, C3}  {C5, C3}
C5      {C5, C3}  {C1, C5, C3}  {C5, C3}
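Converting to sets loses the original element order. If the order matters, a possible variant (a sketch, not the answerer's code) filters each list with a comprehension and derives the labels to drop from LL instead of hard-coding them. It assumes C1 to C5 are the index labels and that blank LL cells are empty strings; `dropped` and `without_dropped` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "RR": [["C1", "C2", "C3", "C4", "C5"], ["C2", "C3", "C5"],
               ["C2", "C3", "C4", "C5"], ["C4"], ["C2", "C3", "C4", "C5"]],
        "AA": [["C1"], ["C1", "C2", "C3", "C5"], ["C1", "C2", "C3", "C5"],
               ["C1", "C3", "C4", "C5"], ["C1", "C2", "C3", "C5"]],
        "SS": [["C1"], ["C5", "C3", "C2"], ["C5", "C3", "C2"],
               ["C4"], ["C5", "C3", "C2"]],
        "LL": ["", "I", "", "I", ""],
    },
    index=["C1", "C2", "C3", "C4", "C5"],
)

# Labels of the flagged rows; these are also the elements to strip from the lists.
dropped = set(df.index[df["LL"] == "I"])


def without_dropped(cell):
    """Return a new list without the dropped labels, keeping the original order."""
    return [item for item in cell if item not in dropped]


# Work on a copy of the surviving rows so the original frame stays untouched.
new_df = df.loc[df["LL"] != "I"].copy()
for col in ["RR", "AA", "SS"]:
    new_df[col] = new_df[col].map(without_dropped)

print(new_df)
```

Because the comprehension builds new lists, the lists stored in `df` are not modified.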

import pandas as pd

col1 = ['C1', 'C2', 'C3', 'C4', 'C5']
RR = [['C1', 'C2', 'C3', 'C4', 'C5'], ['C2', 'C3', 'C5'], ['C2', 'C3', 'C4', 'C5'],
      ['C4'], ['C2', 'C3', 'C4', 'C5']]
AA = [['C1'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C3', 'C4', 'C5'],
      ['C1', 'C2', 'C3', 'C5']]
SS = [['C1'], ['C5', 'C3', 'C2'], ['C5', 'C3', 'C2'], ['C4'], ['C5', 'C3', 'C2']]
LL = ['', 'I', '', 'I', '']

df1 = pd.DataFrame({'col1': col1, 'RR': RR, 'AA': AA, 'SS': SS, 'LL': LL})

# Rows flagged with 'I' in LL, and the col1 labels they carry
removing_row = df1.loc[df1['LL'] == 'I', 'col1']
removing_values = set(removing_row)

# Drop the flagged rows by their index labels
df1 = df1.drop(index=removing_row.index)

# Strip the dropped labels from every list in the remaining rows
for col in ['RR', 'AA', 'SS']:
    df1[col] = df1[col].map(lambda lst: [x for x in lst if x not in removing_values])

print(df1)
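The same result can also be reached without touching the lists element by element in Python code. The sketch below swaps in a different technique, `Series.explode` followed by `isin` and a group-by that rebuilds the lists; it is an alternative, not the approach used above. It reuses the `col1` layout and assumes blank LL cells are empty strings; `flagged` and `kept` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "col1": ["C1", "C2", "C3", "C4", "C5"],
        "RR": [["C1", "C2", "C3", "C4", "C5"], ["C2", "C3", "C5"],
               ["C2", "C3", "C4", "C5"], ["C4"], ["C2", "C3", "C4", "C5"]],
        "AA": [["C1"], ["C1", "C2", "C3", "C5"], ["C1", "C2", "C3", "C5"],
               ["C1", "C3", "C4", "C5"], ["C1", "C2", "C3", "C5"]],
        "SS": [["C1"], ["C5", "C3", "C2"], ["C5", "C3", "C2"],
               ["C4"], ["C5", "C3", "C2"]],
        "LL": ["", "I", "", "I", ""],
    }
)

flagged = df1["LL"] == "I"
removing_values = df1.loc[flagged, "col1"]
kept = df1.loc[~flagged].copy()

for col in ["RR", "AA", "SS"]:
    # One row per list element; the index label is repeated for each element.
    exploded = kept[col].explode()
    # Discard the elements that name a removed row.
    exploded = exploded[~exploded.isin(removing_values)]
    # Collect the surviving elements back into one list per original row.
    kept[col] = exploded.groupby(level=0).agg(list)

print(kept)
```

One caveat with this sketch: a row whose list becomes empty after filtering has no group left, so its cell ends up as NaN rather than an empty list. That does not happen with the data shown here.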
