
Remove all rows in pandas dataframe with mixed data types that contain a specific string in multiple columns

How can I drop all rows in a dataframe if any column of the row contains "9999-Don't Know"?

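I have been able to find solutions that remove rows based on the format of the values across the whole dataframe (string, number, etc.), or based on the values in one specific column, or from a dataframe with only a few columns by using the column names.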

The closest solution I found does not work for me, because there are too many columns (more than 76) for me to type out all the column names.

Below is a sample dataset:

import pandas as pd

# DataFrame.from_items was removed in pandas 1.0; a dict keeps the same column order
df = pd.DataFrame({
    'RespondentId': ['1ghi3g', '335hduu', '4vlsiu4', '5nnvkkt', '634deds', '7kjng'],
    'Satisfaction - Timing': ['9-Excellent', '9-Excellent', "9999-Don't Know", '8-Very Good', '1-Very Unsatisfied', "9999-Don't Know"],
    'Response Speed - Time': ["9999-Don't Know", "9999-Don't Know", '9-Excellent', '9-Excellent', '9-Excellent', '9-Excellent'],
})

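After removing the 4 rows that contain "9999-Don't Know", the output should look like the following, so that I can write a new Excel file with the cleaned data.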

# Expected result, written with the dict constructor (from_items was removed in pandas 1.0)
pd.DataFrame({
    'RespondentId': ['5nnvkkt', '634deds'],
    'Satisfaction - Timing': ['8-Very Good', '1-Very Unsatisfied'],
    'Response Speed - Time': ['9-Excellent', '9-Excellent'],
})

Use

In [677]: df[~(df == "9999-Don't Know").any(axis=1)]
Out[677]:
  RespondentId Satisfaction - Timing Response Speed - Time
3      5nnvkkt           8-Very Good           9-Excellent
4      634deds    1-Very Unsatisfied           9-Excellent

Or

In [683]: df[(df != "9999-Don't Know").all(axis=1)]
Out[683]:
  RespondentId Satisfaction - Timing Response Speed - Time
3      5nnvkkt           8-Very Good           9-Excellent
4      634deds    1-Very Unsatisfied           9-Excellent

Same as

In [686]: df[~df.eq("9999-Don't Know").any(axis=1)]
Out[686]:
  RespondentId Satisfaction - Timing Response Speed - Time
3      5nnvkkt           8-Very Good           9-Excellent
4      634deds    1-Very Unsatisfied           9-Excellent

Or

In [687]: df[df.ne("9999-Don't Know").all(axis=1)]
Out[687]:
  RespondentId Satisfaction - Timing Response Speed - Time
3      5nnvkkt           8-Very Good           9-Excellent
4      634deds    1-Very Unsatisfied           9-Excellent
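All four expressions above build the same boolean row mask: compare every cell with the string, then reduce across the columns of each row. Here is a minimal self-contained sketch of that idea. It assumes the question's sample frame is rebuilt with the dict constructor, and the names `MISSING`, `has_missing` and `cleaned` are only illustrative.

```python
import pandas as pd

MISSING = "9999-Don't Know"  # the string to filter on, taken from the question

df = pd.DataFrame({
    "RespondentId": ["1ghi3g", "335hduu", "4vlsiu4", "5nnvkkt", "634deds", "7kjng"],
    "Satisfaction - Timing": ["9-Excellent", "9-Excellent", MISSING,
                              "8-Very Good", "1-Very Unsatisfied", MISSING],
    "Response Speed - Time": [MISSING, MISSING, "9-Excellent",
                              "9-Excellent", "9-Excellent", "9-Excellent"],
})

# True for each row in which at least one cell equals the string
has_missing = df.eq(MISSING).any(axis=1)

# Keep only the rows where no cell matched
cleaned = df[~has_missing]

# "not any equal" and "all not equal" select the same rows
assert cleaned.equals(df[df.ne(MISSING).all(axis=1)])

print(cleaned)
```

Because the mask is computed over every column at once, none of the 76+ column names has to be typed out.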

For mixed column types, see @PiR's comment: use df.astype(object)

In [695]: df[df.astype(object).ne("9999-Don't Know").all(axis=1)]
Out[695]:
  RespondentId Satisfaction - Timing Response Speed - Time
3      5nnvkkt           8-Very Good           9-Excellent
4      634deds    1-Very Unsatisfied           9-Excellent
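The sample data contains only strings, so here is a small sketch of the mixed-type case. The frame and its `Age` column are hypothetical and not from the question. The cast to `object` is applied only while building the mask, so the rows that are kept should retain their original dtypes.

```python
import pandas as pd

MISSING = "9999-Don't Know"

# Hypothetical frame with a numeric column next to string columns
df = pd.DataFrame({
    "RespondentId": ["a1", "b2", "c3"],
    "Age": [34, 51, 28],
    "Satisfaction": ["9-Excellent", MISSING, "8-Very Good"],
})

# Cast to object for the comparison only; df itself is left unchanged
mask = df.astype(object).ne(MISSING).all(axis=1)
cleaned = df[mask]

print(cleaned)
print(cleaned.dtypes)
```

Recent pandas versions may accept the comparison without the cast. It is kept here as the defensive form suggested in the comment.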

