How do I delete common elements from one column A that are present in another column B in Pandas?
How can I delete the values (str, int, float) from one column that also appear in another column?
Suppose I have the following dataframe:
colA                       colB
eat a nice icecream        icecream
I love to walk a lot       walk , to
the city Paris is super    Paris, super
...
I would like to have this result:
colA                       colB
eat a nice                 icecream
I love a lot               walk , to
the city is                Paris, super
...
This should be applied to every row of a large pandas DataFrame.
I have already lowercased the text and tokenized the sentences, but I am stuck on how to apply the removal itself.
Thank you.
Try this.
Code to build the df:
import pandas as pd

df = pd.DataFrame({
    'colA': ['eat a nice icecream', 'I love to walk a lot', 'the city Paris is super'],
    'colB': ['icecream', 'walk , to', 'Paris, super']})
                      colA          colB
0      eat a nice icecream      icecream
1     I love to walk a lot     walk , to
2  the city Paris is super  Paris, super
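Code to get the expected output:
# Treat commas in colB as separators so that 'Paris,' matches 'Paris'; index columns by label, not position
df.apply(lambda x: ' '.join(w for w in x['colA'].split() if w not in x['colB'].replace(',', ' ').split()), axis=1)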
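For a large frame, the same idea may be easier to read and test as a named helper. The following is only a sketch under a few assumptions: the words to remove in colB are separated by commas and/or whitespace, matching is case-sensitive and on whole words, and there are no missing values (a NaN would be turned into the string "nan"). The name remove_common is hypothetical and is not part of pandas.

```python
import re

import pandas as pd


def remove_common(text, to_remove):
    """Return `text` without the words that also appear in `to_remove`."""
    # str() lets int/float cells be handled like strings.
    # Split on runs of commas and/or whitespace: "Paris, super" -> {"Paris", "super"}
    banned = {tok for tok in re.split(r"[,\s]+", str(to_remove)) if tok}
    return " ".join(word for word in str(text).split() if word not in banned)


df = pd.DataFrame({
    "colA": ["eat a nice icecream", "I love to walk a lot", "the city Paris is super"],
    "colB": ["icecream", "walk , to", "Paris, super"],
})

# Iterating over the two zipped columns avoids building a Series per row,
# which is what df.apply(..., axis=1) does.
df["colA"] = [remove_common(a, b) for a, b in zip(df["colA"], df["colB"])]
print(df)
```

Building a set for the words of colB keeps each membership check cheap, and assigning the result back to df["colA"] overwrites the column in place; assign it to a new column name instead if the original text should be kept.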