Remove rows if any column contains a specific string

Question

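I am trying to find the best way in R to remove rows that contain a specific string, in my case "no_data".

I have data from an external source that encodes NA values as "no_data".

An example looks like this: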

 time  |speed  |wheels
1:00   |30     |no_data
2:00   |no_data|18
no_data|no_data|no_data
3:00   |50     |18

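I want to go through the data and remove every row that contains this "no_data" string in any column. I have had a lot of trouble figuring this out. I have tried sapply, filter, grep, and combinations of the three. I am by no means an R expert, so it may just be that I am using them incorrectly. Any help would be appreciated.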

Answer 1

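We can use rowSums to build a logical vector and subset the data frame with it: df1 == "no_data" gives a logical matrix, rowSums counts the matches in each row, and only rows with a count of zero are kept.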

df1[rowSums(df1 == "no_data")==0, , drop = FALSE]
#   time speed wheels
#4 3:00    50     18

Data

df1 <- structure(list(time = c("1:00", "2:00", "no_data", "3:00"), speed = c("30", 
"no_data", "no_data", "50"), wheels = c("no_data", "18", "no_data", 
"18")), .Names = c("time", "speed", "wheels"), class = "data.frame", 
row.names = c(NA, -4L))
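
One case the rowSums approach does not cover by itself is data that also holds genuine NA values. Comparing NA with a string yields NA, so the row count becomes NA and the subset returns a row filled with NAs. The sketch below is an illustration only: df_na is a hypothetical variant of df1 with one real NA, and na.rm = TRUE is one possible way to ignore it.

```r
# Hypothetical variant of df1 with a genuine missing value in the last row
df_na <- data.frame(
  time   = c("1:00", "2:00", "no_data", "3:00"),
  speed  = c("30", "no_data", "no_data", NA),
  wheels = c("no_data", "18", "no_data", "18"),
  stringsAsFactors = FALSE
)

# NA == "no_data" is NA, so count the matches per row with na.rm = TRUE
keep <- rowSums(df_na == "no_data", na.rm = TRUE) == 0

# Keep only rows without any "no_data" cell; genuine NAs are left alone
result <- df_na[keep, , drop = FALSE]
print(result)
```

Whether a row with a real NA should be kept is a separate decision; this sketch keeps it and only removes the placeholder rows.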

Answer 2

You can read the data with na.strings = 'no_data' so that those values become NA, and then simply omit the NAs (or use complete.cases), i.e. (using @akrun's data set)

d1 <- read.table(text = 'time   speed  wheels
            1    1:00      30 no_data
            2    2:00 no_data      18
            3 no_data no_data no_data
            4    3:00      50      18', na.strings = 'no_data', header = TRUE)

d1[complete.cases(d1),]
#  time speed wheels
#4 3:00    50     18

#OR

na.omit(d1)
#  time speed wheels
#4 3:00    50     18
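
If the data is already loaded and cannot be re-read with na.strings, the same idea can be applied after the fact. This is a minimal sketch under that assumption: df2 is a hypothetical copy of the example data, and the type.convert() step is an optional extra that is not part of the answer above.

```r
# Hypothetical copy of the example data, already loaded as character columns
df2 <- data.frame(
  time   = c("1:00", "2:00", "no_data", "3:00"),
  speed  = c("30", "no_data", "no_data", "50"),
  wheels = c("no_data", "18", "no_data", "18"),
  stringsAsFactors = FALSE
)

# Replace the placeholder string with NA in every column
df2[df2 == "no_data"] <- NA

# Drop every row that now contains at least one NA
clean <- na.omit(df2)

# The columns are still character; let R re-guess the types column by column
clean[] <- lapply(clean, type.convert, as.is = TRUE)
str(clean)
```

After the conversion, speed and wheels are numeric while time stays a character column, which roughly matches what read.table would have produced with na.strings set.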

Answer 3

Two dplyr options (using akrun's data from this answer):

library(dplyr)

## using if_all(), which replaces across() inside filter() in current dplyr

df1 %>% filter(if_all(everything(), ~ !grepl("no_data", .)))
#>   time speed wheels
#> 1 3:00    50     18

## with the superseded filter_all

df1 %>% filter_all(all_vars(!grepl("no_data", .)))
#>   time speed wheels
#> 1 3:00    50     18

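Caveat:
This only works if you want to remove all rows that contain the string. If you want to keep all rows that contain the string, all_vars(grepl('no_data', .)) (without the !) is not enough: it only returns rows in which every column contains the string. In that case, use filter_all(any_vars()) instead.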
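
The any/all distinction in the caveat can be sketched with if_any() and if_all(), which play the same roles as any_vars() and all_vars(). This assumes a dplyr version that provides both helpers (1.0.4 or later); df3 is a hypothetical copy of the example data.

```r
library(dplyr)

# Hypothetical copy of the example data
df3 <- data.frame(
  time   = c("1:00", "2:00", "no_data", "3:00"),
  speed  = c("30", "no_data", "no_data", "50"),
  wheels = c("no_data", "18", "no_data", "18"),
  stringsAsFactors = FALSE
)

# Rows in which at least one column contains the string
with_any <- df3 %>% filter(if_any(everything(), ~ grepl("no_data", .x)))

# Rows in which every column contains the string
with_all <- df3 %>% filter(if_all(everything(), ~ grepl("no_data", .x)))

nrow(with_any)
nrow(with_all)
```

Here with_any holds the three rows that have at least one "no_data" cell, while with_all holds only the single row in which all three cells are "no_data".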

Answer 4

akrun's answer is quick, correct, and simple :) But if you want to make your life more complicated, you can also do this:

dat
#      time   speed  wheels
# 1    1:00      30 no_data
# 2    2:00 no_data      18
# 3 no_data no_data no_data
# 4    3:00      50      18

dat$new <- apply(dat[,1:3], 1, function(x) any(x %in% c("no_data")))
dat <- dat[!(dat$new==TRUE),]
dat$new <- NULL

dat
#   time speed wheels
# 4 3:00    50     18
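
The answers in this thread test for the string in different ways: == and %in% match whole cells, while grepl matches substrings, and the three treat NA differently. The sketch below illustrates the difference on a hypothetical vector x; the value "no_data_yet" is made up for the example.

```r
# Hypothetical values: an exact placeholder, a longer string, a number, an NA
x <- c("no_data", "no_data_yet", "18", NA)

# Exact match with ==: NA stays NA
exact_eq <- x == "no_data"

# Exact match with %in%: NA becomes FALSE
exact_in <- x %in% "no_data"

# Substring match: "no_data_yet" also counts, NA becomes FALSE
partial <- grepl("no_data", x, fixed = TRUE)

print(exact_eq)
print(exact_in)
print(partial)
```

For a placeholder that always fills the whole cell, the exact tests are the stricter choice; grepl is only needed when the string may be embedded in longer text.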
