How to drop a row containing NaN from a pandas DataFrame
I'm working with Python 3.x. I have a pandas DataFrame with only one column, student. At row 501, student contains NaN:
df.at[501,'student']
returns nan.
To remove it, I used the following code:
df.at['student'].replace('', np.nan, inplace=True)
But after that I still get nan for
df.at[501,'student']
I also tried this:
df.at['student'].replace('', np.nan, inplace=True)
I'm using df in a for loop to check the value of student and apply some business logic, but with
inplace=True
I get KeyError: 501.
Can you suggest how I can remove the NaN and still use df in a for loop to check the student value?
Adding another answer since it's a completely different case.
I think you are not looping correctly over the DataFrame. It seems you are looping by relying on the index labels of the DataFrame, when you should probably loop over the items row by row or, preferably, use
df.apply
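To illustrate the suggestion above, here is a minimal sketch of applying per-student logic without looking rows up by a hard-coded index label. The sample data and the business_logic function are hypothetical placeholders, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one 'student' column with a missing value.
df = pd.DataFrame({"student": ["alice", np.nan, "bob"]})

def business_logic(name):
    # Placeholder for the real per-student logic (an assumption).
    return name.upper()

# Drop the missing values first, then apply the function to each remaining value.
clean = df.dropna(subset=["student"])
result = clean["student"].apply(business_logic)

# Row-by-row alternative that iterates over the rows that actually exist.
names = [row.student for row in clean.itertuples(index=False)]
```

Either form only visits rows that are still present, so a label such as 501 that was dropped is never requested.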
If you still want to loop over the items and you don't care about the previous index, you can reset the index with df.reset_index(drop=True):
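# turn empty strings into NaN so dropna can catch them
df['student'] = df['student'].replace('', np.nan)
# drop the rows where student is NaN (on df itself, not on the column)
df = df.dropna(subset=['student'])
# renumber the rows 0..n-1
df = df.reset_index(drop=True)
# do your loop here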
Your problem is that you are dropping the item at index 501 and then trying to access it. When you drop items, pandas doesn't automatically renumber the index, so the label 501 no longer exists.
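The following small sketch reproduces that behaviour on a made-up three-row frame (label 1 stands in for the 501 from the question): after the drop, the label is gone, looking it up raises KeyError, and reset_index(drop=True) renumbers the remaining rows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with NaN at label 1.
df = pd.DataFrame({"student": ["a", np.nan, "c"]})

dropped = df.dropna(subset=["student"])
labels_after_drop = dropped.index.tolist()  # the label of the dropped row is missing

try:
    dropped.at[1, "student"]  # the label was removed together with the row
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Renumber the remaining rows so the labels are contiguous again.
renumbered = dropped.reset_index(drop=True)
labels_after_reset = renumbered.index.tolist()
```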
The replace function that you use replaces occurrences of its first parameter with the second, so replace('', np.nan) turns empty strings into NaN rather than removing NaN.
If you want to replace np.nan with an empty string, then you have to do:
df['student'] = df['student'].replace(np.nan, '')
But this would not remove the row; it would just replace the value with an empty string. What you want is:
df.dropna(subset=['student'], inplace=True)
But you have to do this before looping over the elements; don't call dropna inside the loop.
It would be helpful to know exactly what you are doing in the loop.
One way to remove the rows that contain NaN values in the "student" column is:
df = df[~df['student'].isnull()]
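As a variation on the same boolean-mask idea, the sketch below also filters out empty strings, since the question mentions both. The sample data is made up, and treating empty strings as missing is an assumption about what the asker wants.

```python
import numpy as np
import pandas as pd

# Hypothetical data containing both an empty string and a NaN.
df = pd.DataFrame({"student": ["alice", "", np.nan, "bob"]})

# Keep rows whose student value is neither missing nor an empty string.
mask = df["student"].notna() & (df["student"] != "")
clean = df[mask].reset_index(drop=True)
```

The reset_index(drop=True) call is optional; it only matters if later code addresses rows by their integer labels.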