R: Subset factor levels that co-occur with two levels from another factor
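I have a data frame made up of several columns. I want to subset it so that it keeps only the rows where a level of one factor co-occurs with more than one level of another factor. For the simplified example data below, I would be left with only the first two rows, i.e. Gene = GeneA, GeneA and Tissue = TissueA, TissueB.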
A <- c("GeneA","GeneA","GeneB","GeneB","GeneC","GeneC")
B <- c("TissueA","TissueB","TissueA","TissueA","TissueA","TissueA")
# stringsAsFactors = TRUE keeps both columns as factors on R >= 4.0
df <- data.frame(Gene = A, Tissue = B, stringsAsFactors = TRUE)
Thanks in advance.
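Here is one idea. You use Gene to define the groups. Within each group, you check whether Tissue has more than one unique value, and keep the group only if it does.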
library(dplyr)

# Keep only the Gene groups that contain at least two distinct Tissue values
group_by(df, Gene) %>%
  filter(n_distinct(Tissue) >= 2)
  Gene  Tissue
  <fct> <fct>
1 GeneA TissueA
2 GeneA TissueB
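If dplyr is not available, the same group-wise check can probably be done in base R as well. The following is only a minimal sketch, not part of the original answer; the names `counts`, `keep`, and `result` are made up for illustration. It counts the distinct Tissue values per Gene with `tapply()` and then keeps the rows whose Gene passes the threshold.

```r
A <- c("GeneA","GeneA","GeneB","GeneB","GeneC","GeneC")
B <- c("TissueA","TissueB","TissueA","TissueA","TissueA","TissueA")
df <- data.frame(Gene = A, Tissue = B)

# Number of distinct Tissue values observed for each Gene
counts <- tapply(df$Tissue, df$Gene, function(x) length(unique(x)))

# Genes that occur with at least two different tissues
keep <- names(counts)[counts >= 2]

# Rows belonging to those genes
result <- df[df$Gene %in% keep, ]
result
```

Unlike the dplyr version, this returns a plain data frame rather than a grouped tibble. If the columns are factors, the unused levels remain attached to the result, and `droplevels(result)` removes them.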