How to remove words from column A that also appear in column B in Pandas?

Question

How can I remove from one column the content (str, int, float) that also appears in another column?

Suppose I have a DataFrame:

colA                              colBB            
eat a nice icecream               icecream            
I love to walk a lot              walk , to          
the city Paris is super           Paris, super  
        .
        .
        .

I want this result:

colA                    colBB          
eat a nice              icecream          
I love a lot            walk , to           
the city is             Paris, super 
        .
        .
        .

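This should apply to every row of the pandas DataFrame.

I have already lowercased the text and tokenized the sentences, but after that I got stuck on how to apply this...

Thanks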

Answer 1

Try this.

Code to create the df:

import pandas as pd

df = pd.DataFrame({
    'colA': ['eat a nice icecream', 'I love to walk a lot', 'the city Paris is super'],
    'colB': ['icecream', 'walk , to', 'Paris, super']})

    colA                      colB
0   eat a nice icecream       icecream
1   I love to walk a lot      walk , to
2   the city Paris is super   Paris, super

Code to get the expected output:

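# Split colB on commas and whitespace so that 'Paris, super' yields 'Paris' and 'super'.
df['colA'] = df.apply(lambda x: ' '.join(w for w in x['colA'].split() if w not in x['colB'].replace(',', ' ').split()), axis=1)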
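
For readability, the same row-wise idea can be pulled out into a small helper. This is only a sketch: the function name remove_common_words is made up for illustration, and it assumes colB holds words separated by commas and/or spaces and that matching should be exact (case-sensitive, whole words only).

```python
import pandas as pd


def remove_common_words(sentence: str, tokens: str) -> str:
    """Return `sentence` without the words listed in `tokens`.

    `tokens` is assumed to be a string of words separated by commas
    and/or whitespace, e.g. 'walk , to'.
    """
    # Treat commas as separators, so 'walk , to' becomes {'walk', 'to'}.
    to_remove = set(tokens.replace(',', ' ').split())
    # Keep only the words of the sentence that are not in the removal set.
    return ' '.join(word for word in sentence.split() if word not in to_remove)


df = pd.DataFrame({
    'colA': ['eat a nice icecream', 'I love to walk a lot', 'the city Paris is super'],
    'colB': ['icecream', 'walk , to', 'Paris, super'],
})

# Pair up the two columns row by row and overwrite colA with the cleaned text.
df['colA'] = [remove_common_words(a, b) for a, b in zip(df['colA'], df['colB'])]
print(df)
```

If the text has already been lowercased, as in the question, the values in colB would need to be lowercased the same way for the comparison to match.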

Solution 1
1 2020-03-05 10:20:36
