How to remove rows in a Python data frame with a condition?
I have the following data:
df =
Emp_Name Leaves Leave_Type Salary Performance
0 Christy 20 sick 3000.0 56.6
1 Rocky 10 Casual kkkk 22.4
2 jenifer 50 Emergency 2500.6 '51.6'
3 Tom 10 sick Nan 46.2
4 Harry nn Casual 1800.1 '58.3'
5 Julie 22 sick 3600.2 'unknown'
6 Sam 5 Casual Nan 47.2
7 Mady 6 sick unknown Nan
Output:
Emp_Name Leaves Leave_Type Salary Performance
0 Christy 20 sick 3000.0 56.6
1 jenifer 50 Emergency 2500.6 51.6
2 Tom 10 sick Nan 46.2
3 Sam 5 Casual Nan 47.2
4 Mady 6 sick unknown Nan
I want to delete records where there is a datatype error in the numerical columns (Leaves, Salary, Performance).
If a numerical column contains a string, then that row should be deleted from the data frame.
df[['Leaves','Salary','Performance']].apply(pd.to_numeric, errors = 'coerce')
but this will convert the values to NaN.
Let's start with a note concerning your sample data:
It contains Nan strings, which are not among the strings automatically recognized as NaN. To treat them as NaN, I read the source text with read_fwf, passing na_values=['Nan'].
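A minimal sketch of that reading step, assuming the sample is available as fixed-width text. The `text` variable and its column alignment are a reconstruction for illustration, not the original file:

```python
import io
import pandas as pd

# Hypothetical reconstruction of the source text; the column alignment is assumed.
text = """\
Emp_Name  Leaves  Leave_Type  Salary   Performance
Christy   20      sick        3000.0   56.6
Rocky     10      Casual      kkkk     22.4
jenifer   50      Emergency   2500.6   '51.6'
Tom       10      sick        Nan      46.2
Harry     nn      Casual      1800.1   '58.3'
Julie     22      sick        3600.2   'unknown'
Sam       5       Casual      Nan      47.2
Mady      6       sick        unknown  Nan
"""

# 'Nan' is not a default NA marker, so it is declared explicitly.
df = pd.read_fwf(io.StringIO(text), na_values=['Nan'])
print(df)
```

Because each of the three columns contains at least one non-numeric string, the remaining cells are read as strings, which is what the check below relies on.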
And now let's get down to the main task:
Define a function to check whether a cell is acceptable:
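import pandas as pd

def isAcceptable(cell):
    # NaN and the bare word 'unknown' are accepted as they are
    if pd.isna(cell) or cell == 'unknown':
        return True
    # Otherwise accept only digits and dots (str() guards against non-string cells)
    return all(c.isdigit() or c == '.' for c in str(cell))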
I noticed that you accept NaN values. You also accept a cell if it contains only the string unknown, but you don't accept a cell if that word is enclosed in, e.g., quotes.
If you change your mind about what is / is not acceptable, change the above function accordingly.
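For instance, here is one possible variant, offered only as a sketch: it additionally accepts numbers wrapped in single quotes, such as '51.6'. The name `is_acceptable_lenient` and the quote-stripping rule are assumptions of mine, not something stated in the question:

```python
import pandas as pd

def is_acceptable_lenient(cell):
    """Hypothetical variant: also accept numbers wrapped in single quotes."""
    # NaN and the bare word 'unknown' are accepted, as before
    if pd.isna(cell) or cell == 'unknown':
        return True
    try:
        # Strip surrounding single quotes, then let float() decide
        float(str(cell).strip("'"))
        return True
    except ValueError:
        return False

print(is_acceptable_lenient("'51.6'"))     # quoted number
print(is_acceptable_lenient("'unknown'"))  # quoted word
```

Keep in mind that float() also parses strings such as 'nan' or 'inf', so this variant is more permissive than the character-by-character check.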
Then, to leave only rows with all acceptable values in all 3 mentioned columns, run:
df[df[['Leaves', 'Salary', 'Performance']].applymap(isAcceptable).all(axis=1)]
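The whole approach can be tried end to end with the sketch below. It assumes the cells are strings (with missing values as None) and that the quotes shown in the sample are literally part of the data. It uses a column-wise Series.map instead of applymap, since recent pandas releases deprecate applymap in favor of DataFrame.map:

```python
import pandas as pd

def isAcceptable(cell):
    # NaN and the bare word 'unknown' are accepted
    if pd.isna(cell) or cell == 'unknown':
        return True
    # Otherwise only digits and dots are allowed
    return all(c.isdigit() or c == '.' for c in str(cell))

# Sample data rebuilt by hand; the 'Nan' entries are entered as missing values.
df = pd.DataFrame({
    'Emp_Name': ['Christy', 'Rocky', 'jenifer', 'Tom', 'Harry', 'Julie', 'Sam', 'Mady'],
    'Leaves': ['20', '10', '50', '10', 'nn', '22', '5', '6'],
    'Leave_Type': ['sick', 'Casual', 'Emergency', 'sick', 'Casual', 'sick', 'Casual', 'sick'],
    'Salary': ['3000.0', 'kkkk', '2500.6', None, '1800.1', '3600.2', None, 'unknown'],
    'Performance': ['56.6', '22.4', "'51.6'", '46.2', "'58.3'", "'unknown'", '47.2', None],
})

cols = ['Leaves', 'Salary', 'Performance']
# True for rows where every checked cell passes isAcceptable
mask = df[cols].apply(lambda col: col.map(isAcceptable)).all(axis=1)
result = df[mask].reset_index(drop=True)
print(result['Emp_Name'].tolist())  # ['Christy', 'Tom', 'Sam', 'Mady']
```

Under these assumptions the jenifer row is dropped as well, because its Performance cell contains quote characters, so the result differs from the expected output in the question on that one row.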