How do I remove rows containing NA from a data frame, while keeping exceptions for certain rows?

Question

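My df has about 17,000 rows (genes) and 200 columns (patients). I need to remove the genes that contain NAs, but 12 of them are important to my analysis, so I won't remove those. Instead, I want to remove any patient that has an NA in any of those 12 genes.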

How would I code this? (I couldn't find a similar question, sorry.)

Answer 1

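It is always good to include a toy example and your expected result in your question. It lets users answer without having to write a toy example themselves. I made a small one with 5 patients, 5 important genes and 5 less important genes.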

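You can do what you want in two steps. First, we remove patients using colSums and is.na. In other words, we count the NAs in each column, but only over the rows of the important genes (rows 1:5), and keep only the columns where that count is zero. Then we simply run na.omit to remove the genes that still contain an NA. Because the first step leaves no NA in the important rows, na.omit cannot drop any of them.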
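To make the counting step concrete, here is a minimal sketch on a tiny hand-made matrix. The matrix `m`, its values and its dimension names are made up for illustration only. `is.na` returns a logical matrix, and `colSums` adds up the `TRUE` values in each column, which gives the number of NAs per patient.

```r
# Hypothetical 3-gene x 3-patient matrix (filled column by column)
m <- matrix(c(1, NA, 3,    # patient1: one NA
              4, 5, 6,     # patient2: no NA
              NA, NA, 9),  # patient3: two NAs
            nrow = 3,
            dimnames = list(paste0("gene", 1:3), paste0("patient", 1:3)))

na_per_patient <- colSums(is.na(m))    # NA count per column: 1, 0, 2
keep <- na_per_patient == 0            # TRUE only for columns without NA
m_complete <- m[, keep, drop = FALSE]  # drop = FALSE keeps the matrix shape
```

Only `patient2` survives here. The `drop = FALSE` argument matters when a single column is left, since R would otherwise return a plain vector.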

# Example data (letters are sampled at random, so your values will differ from the printout below)
df1 <- data.frame(matrix(sample(letters, 50, replace = TRUE), ncol = 5))
colnames(df1) <- paste0("patient", 1:5)
rownames(df1) <- c(paste0("important", 1:5), paste0("lessimportant", 6:10))
# Add a few NAs for the example
df1[2, 4] <- NA; df1[7, 1] <- NA; df1[9, 5] <- NA

df1
                patient1 patient2 patient3 patient4 patient5
important1             m        f        d        t        m
important2             t        v        j     <NA>        d
important3             s        n        h        t        p
important4             h        h        t        n        i
important5             x        t        c        r        p
lessimportant6         y        f        b        a        h
lessimportant7      <NA>        o        h        n        a
lessimportant8         o        g        o        l        x
lessimportant9         m        p        f        d     <NA>
lessimportant10        n        a        h        u        a

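# To remove NAs according to your specifications
# Step 1: remove patients (columns) with an NA in any important gene (rows 1:5);
# drop = FALSE keeps a data frame even if only one patient remains
df1 <- df1[, colSums(is.na(df1[1:5, ])) == 0, drop = FALSE]
# Step 2: remove the remaining genes (rows) that contain an NA
df1 <- na.omit(df1)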

#result
df1
                patient1 patient2 patient3 patient5
important1             m        f        d        m
important2             t        v        j        d
important3             s        n        h        p
important4             h        h        t        i
important5             x        t        c        p
lessimportant6         y        f        b        h
lessimportant8         o        g        o        x
lessimportant10        n        a        h        a
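In the real data the 12 important genes are unlikely to sit in the first rows, so selecting them by name may be more convenient than `1:5`. The following is a minimal sketch under that assumption. The helper `drop_na_keep_important`, its `important` argument and the `toy` data are hypothetical names introduced here, not part of the original answer. It uses `complete.cases`, which keeps the same rows as `na.omit` would.

```r
# Hypothetical helper: `important` is a character vector of row names to protect
drop_na_keep_important <- function(df, important) {
  # Fail early if a protected gene is missing from the data frame
  stopifnot(all(important %in% rownames(df)))
  # Step 1: drop patients (columns) with an NA in any protected gene
  keep_cols <- colSums(is.na(df[important, , drop = FALSE])) == 0
  df <- df[, keep_cols, drop = FALSE]
  # Step 2: drop the remaining genes (rows) that still contain an NA
  df[complete.cases(df), , drop = FALSE]
}

# Small deterministic example with made-up values
toy <- data.frame(
  p1 = c("a", "b", NA,  "d"),
  p2 = c("e", NA,  "g", "h"),
  p3 = c("i", "j", "k", "l"),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)
result <- drop_na_keep_important(toy, important = c("geneB", "geneD"))
```

Here `p2` is removed because the protected `geneB` has an NA there, and `geneC` is then removed because it has an NA in `p1`. The protected rows cannot be lost in step 2, since step 1 has already removed every column in which they had an NA.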

Solution 1
-1 2017-04-11 21:49:41
