
How can I remove rows with inf from my dataframe in R?

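I have a very large data frame (ICS_data) with about 129 columns (variables) and 5276 rows. Some rows contain Inf values in one or more variables. I have already removed the rows with NA and NaN using na.omit(df), but I still get an error. Searching SO for similar errors, I found ICS_data[is.finite(rowSums(ICS_data)),] as a possible solution, but when I run it on my data frame I get another error message:

    > powerdata <- ICS_data[is.finite(rowSums(ICS_data)),]
    Error in rowSums(ICS_data): 'x' must be numeric

I checked my data set and every column is numeric except my reference variable, which is a factor. Can someone help me?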

    > sapply(ICS_data, class)
     R1.PA1.VH           R1.PM1.V          R1.PA2.VH           R1.PM2.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R1.PA3.VH           R1.PM3.V          R1.PA4.IH           R1.PM4.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R1.PA5.IH           R1.PM5.I          R1.PA6.IH           R1.PM6.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R1.PA7.VH           R1.PM7.V          R1.PA8.VH           R1.PM8.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R1.PA9.VH           R1.PM9.V         R1.PA10.IH          R1.PM10.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
    R1.PA11.IH          R1.PM11.I         R1.PA12.IH          R1.PM12.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
          R1.F              R1.DF            R1.PA.Z           R1.PA.ZH 
     "numeric"          "numeric"          "numeric"          "numeric" 
          R1.S          R2.PA1.VH           R2.PM1.V          R2.PA2.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R2.PM2.V          R2.PA3.VH           R2.PM3.V          R2.PA4.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R2.PM4.I          R2.PA5.IH           R2.PM5.I          R2.PA6.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R2.PM6.I          R2.PA7.VH           R2.PM7.V          R2.PA8.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R2.PM8.V          R2.PA9.VH           R2.PM9.V         R2.PA10.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R2.PM10.I         R2.PA11.IH          R2.PM11.I         R2.PA12.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R2.PM12.I               R2.F              R2.DF            R2.PA.Z 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R2.PA.ZH               R2.S          R3.PA1.VH           R3.PM1.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R3.PA2.VH           R3.PM2.V          R3.PA3.VH           R3.PM3.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R3.PA4.IH           R3.PM4.I          R3.PA5.IH           R3.PM5.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R3.PA6.IH           R3.PM6.I          R3.PA7.VH           R3.PM7.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R3.PA8.VH           R3.PM8.V          R3.PA9.VH           R3.PM9.V 
     "numeric"          "numeric"          "numeric"          "numeric" 
    R3.PA10.IH          R3.PM10.I         R3.PA11.IH          R3.PM11.I 
     "numeric"          "numeric"          "numeric"          "numeric" 
    R3.PA12.IH          R3.PM12.I               R3.F              R3.DF 
     "numeric"          "numeric"          "numeric"          "numeric" 
       R3.PA.Z           R3.PA.ZH               R3.S          R4.PA1.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R4.PM1.V          R4.PA2.VH           R4.PM2.V          R4.PA3.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R4.PM3.V          R4.PA4.IH           R4.PM4.I          R4.PA5.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R4.PM5.I          R4.PA6.IH           R4.PM6.I          R4.PA7.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R4.PM7.V          R4.PA8.VH           R4.PM8.V          R4.PA9.VH 
     "numeric"          "numeric"          "numeric"          "numeric" 
      R4.PM9.V         R4.PA10.IH          R4.PM10.I         R4.PA11.IH 
     "numeric"          "numeric"          "numeric"          "numeric" 
     R4.PM11.I         R4.PA12.IH          R4.PM12.I               R4.F 
     "numeric"          "numeric"          "numeric"          "numeric" 
         R4.DF            R4.PA.Z           R4.PA.ZH               R4.S 
     "numeric"          "numeric"          "numeric"          "numeric" 
    control_panel_log1 control_panel_log2 control_panel_log3 control_panel_log4 
     "numeric"          "numeric"          "numeric"          "numeric" 
    relay1_log         relay2_log         relay3_log         relay4_log 
     "numeric"          "numeric"          "numeric"          "numeric" 
    snort_log1         snort_log2         snort_log3         snort_log4 
     "numeric"          "numeric"          "numeric"          "numeric" 
        marker 
      "factor"

To remove the rows that contain Inf values, you can use:

ICS_data[rowSums(sapply(ICS_data[-ncol(ICS_data)], is.infinite)) == 0, ]

Or, using dplyr:

library(dplyr)
ICS_data %>% filter_at(-ncol(.), all_vars(is.finite(.)))
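
For context, here is a minimal sketch of why the attempt in the question failed and how the same idea can be made to work. The data frame df is a made-up stand-in for ICS_data (its column names and values are assumptions, not taken from the real data). rowSums() converts the whole data frame to a matrix, and a single factor column is enough to make that matrix non-numeric.

```r
# Toy data standing in for ICS_data (hypothetical): two numeric columns and a factor.
df <- data.frame(x = c(1, Inf, 3), y = c(4, 5, 6),
                 marker = factor(c("a", "b", "a")))

# rowSums() on the whole data frame fails because of the factor column.
err <- tryCatch(rowSums(df), error = function(e) conditionMessage(e))
print(err)

# Restricting the sum to the numeric columns makes the original idea work.
# Note: Inf and -Inf in the same row sum to NaN, which is.finite() also rejects.
num_cols <- sapply(df, is.numeric)
cleaned <- df[is.finite(rowSums(df[num_cols])), ]
print(cleaned)
```

Keep in mind that is.finite() is also FALSE for NA and NaN, so this variant drops those rows as well.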

We can break the code down into smaller steps to understand how it works.

Consider this data.

data <- data.frame(a = 1:4, b = 2:5, c = letters[1:4], stringsAsFactors = TRUE)
data$b[2] <- Inf
data
#  a   b c
#1 1   2 a
#2 2 Inf b
#3 3   4 c
#4 4   5 d

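First, we drop the last column from data. It is a factor, and we do not want to include it when looking for infinite values, so we are left with only the numeric columns.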

data[-ncol(data)]

#  a   b
#1 1   2
#2 2 Inf
#3 3   4
#4 4   5
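
Dropping the last column by position only works when the factor happens to be last. As a sketch of an alternative, assuming you would rather pick columns by type, the numeric columns can be selected with is.numeric (the names is_num and numeric_part are just illustrative):

```r
data <- data.frame(a = 1:4, b = 2:5, c = letters[1:4], stringsAsFactors = TRUE)
data$b[2] <- Inf

# Logical vector marking which columns are numeric (integer columns count too).
is_num <- vapply(data, is.numeric, logical(1))

# Keep only those columns, wherever they sit in the data frame.
numeric_part <- data[is_num]
names(numeric_part)
```

This gives the same two columns as data[-ncol(data)] here, but it keeps working if the factor is moved or if there are several non-numeric columns.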

Next, we use sapply to apply is.infinite to each column and find out which values are infinite. This returns a matrix of TRUE / FALSE values.

sapply(data[-ncol(data)], is.infinite)

#         a     b
#[1,] FALSE FALSE
#[2,] FALSE  TRUE
#[3,] FALSE FALSE
#[4,] FALSE FALSE

We can sum these logical values with rowSums. Here TRUE counts as 1 and FALSE counts as 0.

rowSums(sapply(data[-ncol(data)], is.infinite))
#[1] 0 1 0 0
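
The same logical matrix can also be summed by column, which may help when you want to know which variables contain the Inf values before deleting anything. This is only a diagnostic sketch on the toy data; inf_matrix and inf_per_column are illustrative names.

```r
data <- data.frame(a = 1:4, b = 2:5, c = letters[1:4], stringsAsFactors = TRUE)
data$b[2] <- Inf

# TRUE/FALSE matrix of infinite values in the numeric columns.
inf_matrix <- sapply(data[-ncol(data)], is.infinite)

# Count of infinite values in each column.
inf_per_column <- colSums(inf_matrix)
inf_per_column
```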

From this we know that the second row has 1 infinite value and needs to be removed. So we select the rows that have 0 infinite values.

data[rowSums(sapply(data[-ncol(data)], is.infinite)) == 0, ]

#  a b c
#1 1 2 a
#3 3 4 c
#4 4 5 d
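
If you also want to drop NA and NaN in the same pass (the question used na.omit for that), one possible variant tests with is.finite instead, since is.finite is FALSE for NA, NaN, Inf and -Inf. The helper below is a hypothetical sketch, not part of base R or dplyr; it combines the columns with Reduce rather than rowSums so that it also behaves on a single-row data frame, where sapply would return a vector instead of a matrix.

```r
# Hypothetical helper: keep only rows whose numeric columns are all finite.
drop_nonfinite_rows <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  if (!any(is_num)) return(df)
  # One logical vector per numeric column: TRUE where the value is not finite.
  bad <- lapply(df[is_num], function(col) !is.finite(col))
  # Combine the columns with OR to flag rows with at least one bad value.
  bad_row <- Reduce(`|`, bad)
  df[!bad_row, , drop = FALSE]
}

data <- data.frame(a = c(1, NA, 3, 4), b = c(2, 3, -Inf, 5),
                   c = factor(letters[1:4]))
result <- drop_nonfinite_rows(data)
result
```

On this toy data the second row (NA) and the third row (-Inf) are removed, and the factor column is carried along untouched.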

