mlr3 - how to remove incomplete observations using the `mlr3` interface
Is it possible to remove incomplete observations within a task, e.g. `task <- TaskRegr$new("data", data, "y")`, using mlr3 filters or pipeops?
I don't think there is a preprocessing operator for removing observations.
What I would do is use the `filter` method of the Task.
Example:
t = tsk("pima")
ids = complete.cases(t$data())
# number of incomplete observations
sum(!ids)
t$filter(which(ids))
# number of incomplete observations
# should be zero now
ids = complete.cases(t$data())
sum(!ids)
`complete.cases` gives a logical vector that indicates which rows contain complete observations (no NAs).
`filter` subsets the task's data by the row ids provided in the parameter. Row ids not given in the parameter are removed in-place.
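Because `filter` works on row ids rather than row positions, `which(ids)` is only safe while the task's row ids still run from 1 to n, as they do in a freshly created `pima` task. A slightly more defensive sketch, assuming `t$data()` returns rows in the order of `t$row_ids` (the variable names `task`, `complete` and `keep_ids` are just illustrative), indexes the row ids directly:

```r
library(mlr3)

task = tsk("pima")

# Logical vector, one entry per row of task$data(); TRUE if the row has no NA
complete = complete.cases(task$data())

# task$data() returns rows in the order of task$row_ids, so indexing
# row_ids with the logical vector yields the ids of the complete rows
keep_ids = task$row_ids[complete]

# Drop every row whose id is not in keep_ids (modifies the task in-place)
task$filter(keep_ids)

# Number of incomplete observations left in the task
sum(!complete.cases(task$data()))
```

This variant should also behave correctly on a task that has already been filtered once, where positions and row ids no longer coincide.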
If you instead want to impute the missing values, there are a few imputation operators, such as PipeOpImputeConstant, which imputes features with a constant.
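As a rough illustration of the imputation route, the following sketch uses `po("imputeconstant")` from the mlr3pipelines package. The constant `0` is an arbitrary choice made here for demonstration, not a recommendation, and it relies on all `pima` features being numeric:

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("pima")

# Replace every missing feature value with the constant 0
# (0 is an arbitrary value picked for this sketch)
imputer = po("imputeconstant", constant = 0)

# train() takes a list of inputs and returns a list of outputs
imputed = imputer$train(list(task))[[1]]

# All rows are kept, and no missing values should remain
imputed$nrow == task$nrow
anyNA(imputed$data())
```

Unlike `filter`, this keeps all rows and returns a new task, so the original `task` still contains its missing values.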