
In R, how do I filter a data frame to only include rows with >=2 non-NA values?

Suppose I have a data frame:

      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5

I'd like to filter this down to only include rows where the number of non-NA values is at least some threshold (in this case 2). So for this example I would like the result:

      Grp1 Grp2 Grp3
Trt2     2    3   NA
Trt3     4   NA    5

You could use rowSums() and is.na() to filter the data frame. This coerces the values used for filtering into a matrix (so it may have issues with very large data frames), but it should do the trick.

df1[rowSums(!is.na(df1)) >= 2, ]
     Grp1 Grp2 Grp3
Trt2    2    3   NA
Trt3    4   NA    5

Data:

df1 <- read.table(header = TRUE, text = "      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5")
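
To make the rowSums() approach above easier to follow, here is a minimal sketch that breaks the one-liner into its intermediate steps. The variable names `present`, `n_present`, and `min_non_na` are illustrative and not part of the original answer.

```r
# Recreate the example data from the question
df1 <- read.table(header = TRUE, text = "      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5")

# Step 1: logical matrix, TRUE wherever a value is present (not NA)
present <- !is.na(df1)

# Step 2: count the non-NA values in each row (TRUE counts as 1)
n_present <- rowSums(present)

# Step 3: keep only the rows that meet the threshold
min_non_na <- 2  # illustrative name for the threshold
result <- df1[n_present >= min_non_na, ]
print(result)
```

Holding the threshold in a variable makes it easy to change without touching the filtering expression.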

You can do it this way:

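# Count the NA values in each row
count_na <- apply(data, 1, function(x) sum(is.na(x)))
# Keep rows with at least 2 non-NA values (works for any number of columns)
data[ncol(data) - count_na >= 2, ]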

Sample data:

  col1 col2 col3
1    1    1   NA
2   NA   NA    2
3   NA    3    3

Output:

  col1 col2 col3
1    1    1   NA
3   NA    3    3
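
The answer does not show how `data` was built, so the following is a self-contained sketch that assumes the sample data is a plain data frame with three numeric columns. It derives the non-NA count from the NA count so the filter does not depend on there being exactly three columns; `count_non_na` is an illustrative name.

```r
# Assumed construction of the sample data shown above
data <- data.frame(
  col1 = c(1, NA, NA),
  col2 = c(1, NA, 3),
  col3 = c(NA, 2, 3)
)

# Number of NA values in each row
count_na <- apply(data, 1, function(x) sum(is.na(x)))

# Number of non-NA values in each row
count_non_na <- ncol(data) - count_na

# Keep rows with at least 2 non-NA values
result <- data[count_non_na >= 2, ]
print(result)
```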

Another option:

data[apply(data,1,function(x) sum(!is.na(x)) >= 2),]
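
If the filter is needed in more than one place, the one-liner above could be wrapped in a small helper. This is only a sketch: the function name `keep_rows_min_non_na` and its `min_non_na` argument are hypothetical, and the sample data is assumed to be the same three-column data frame as in the previous answer.

```r
# Hypothetical helper: keep rows with at least `min_non_na` non-NA values
keep_rows_min_non_na <- function(df, min_non_na = 2) {
  keep <- apply(df, 1, function(x) sum(!is.na(x)) >= min_non_na)
  # drop = FALSE keeps the result a data frame even with a single column
  df[keep, , drop = FALSE]
}

# Assumed sample data
data <- data.frame(
  col1 = c(1, NA, NA),
  col2 = c(1, NA, 3),
  col3 = c(NA, 2, 3)
)

two_plus <- keep_rows_min_non_na(data)                   # default threshold of 2
one_plus <- keep_rows_min_non_na(data, min_non_na = 1)   # looser threshold
print(two_plus)
```

Note that apply() converts the data frame to a matrix first, so on data with mixed column types every value is coerced to a common type before is.na() is applied; the NA check itself is unaffected by that coercion.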
