How to remove duplicates inside a loop in R

Question

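I have a loop that iterates over a large number of .tsv files, runs a function on each one, and writes the results to a file. The loop works, but some of the .tsv files contain duplicate values in one column, and that stops the loop from working. I need to remove the rows that have duplicate values in column V5. I have tried the commands from earlier answers on this site, but for some reason they do not work.

My input .tsv file looks like this (other_trait):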

V1  V2         V3  V4  V5
10  201874235  G   T   rs389130213
10  201876195  G   C   rs121467298
10  201876295  T   A   rs121467298

My code starts by formatting the file as follows, and then runs the function.

files <- list.files(path =".", pattern = ".tsv")
files
datalist = list()
for(i in 1:length(files)) {  
  other_trait <- read.table(files[i])
  colnames(other_trait)[which(names(other_trait) == "V2")] <- "BP"
  other_trait<- merge(other_trait, subset_1[,c("BP","MAF")], by="BP")
  other_trait <- unique(other_trait$V5)

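I have tried unique() as shown above, as well as other_trait <- other_trait[,(duplicated(other_trait$V5)), ]. unique() discards the other columns of the data frame and keeps only the unique values of V5, and !(duplicated) does not seem to do anything!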

Answer 1

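# Example data copied from the question
df <- read.table(text = "V1 V2 V3 V4 V5
10 201874235 G T rs389130213
10 201876195 G C rs121467298
10 201876295 T A rs121467298", header = TRUE)

library(dplyr)

# subset_1 is the asker's own data frame with BP and MAF columns.
# left_join() keeps rows that have no match in subset_1; merge() in the
# question drops them, which inner_join() would reproduce.
df %>% 
  rename(BP = V2) %>% 
  left_join(subset_1[, c("BP", "MAF")], by = "BP") %>% 
  distinct(V5, .keep_all = TRUE)  # first row per V5 value, all columns kept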
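
If you would rather stay in base R, the two attempts from the question can be fixed with a small change. The sketch below is only an illustration: it uses the three example rows from the question as a stand-in for one file, and the names v5_only and deduped are made up for the example. In the question's second attempt, the logical vector sits after the first comma, where it is read as a column selector rather than a row selector, and it is not negated.

```r
# Toy stand-in for one of the .tsv files (values copied from the question).
# With no header line, read.table() names the columns V1..V5.
other_trait <- read.table(text = "
10 201874235 G T rs389130213
10 201876195 G C rs121467298
10 201876295 T A rs121467298")

# unique() on a single column returns a plain vector, not a data frame,
# so assigning it back to other_trait discards every other column.
v5_only <- unique(other_trait$V5)

# duplicated() flags the second and later occurrences of each value.
# Negate it and place it in the ROW position (before the comma) to keep
# the first row for each V5 value together with all of its columns.
deduped <- other_trait[!duplicated(other_trait$V5), ]
```

Inside the loop, the last line would take the place of the unique() call, with other_trait assigned back to itself.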

Answer 1 (0, accepted, 2022-07-27 11:29:13)