Remove rows with duplicated values in one variable from every data frame in a list, keeping the other variables
I have the following list of data frames, and each data frame has 3 variables (a, b and c):
my.list <- list(d1, d2, d3, d4)
Inside each data frame, I have duplicated strings in "a", and I want to delete the rows with duplicated values.
The code I am currently using:
my.listnew <- lapply(my.list, function(x) unique(x["a"]))
The problem with this code is that the other 2 columns, "b" and "c", are dropped. I want to keep them and delete only the duplicated rows.
Use duplicated() to remove the rows with duplicated values in column a while keeping the other columns. The first occurrence of each value is kept.
my.listnew <- lapply(my.list, function(x) x[!duplicated(x$a), ])
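To make the difference from the original attempt concrete, here is a minimal sketch on made-up data. The toy data frames d1 and d2 below are assumptions for illustration, not the asker's data. unique(x["a"]) works on a one-column data frame, so only a comes back, whereas !duplicated(x$a) is a logical vector used as a row index, so every column survives.

```r
# Toy data (hypothetical; stands in for the asker's d1, d2, ...)
d1 <- data.frame(a = c("x", "y", "x", "z"), b = 1:4, c = c(0.1, 0.2, 0.3, 0.4))
d2 <- data.frame(a = c("p", "p", "q"), b = 5:7, c = c(0.5, 0.6, 0.7))
my.list <- list(d1, d2)

# Original attempt: x["a"] is a one-column data frame, so b and c are dropped
lost <- lapply(my.list, function(x) unique(x["a"]))

# duplicated() flags the second and later occurrences of each value in a;
# negating it gives a row index that keeps the first occurrence with all columns
kept <- lapply(my.list, function(x) x[!duplicated(x$a), ])

names(lost[[1]])  # "a"
names(kept[[1]])  # "a" "b" "c"
kept[[1]]$b       # 1 2 4
```

Note that the original row names are retained (here 1, 2, 4); if that matters, they can be reset with rownames(x) <- NULL.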
Just for reference, the tidyverse way of doing it:
set.seed(1)
my.list <- list(d1 = data.frame(a = sample(letters[1:3], 5, TRUE),
                                b = rnorm(5),
                                c = rnorm(5)),
                d2 = data.frame(a = sample(letters[1:3], 5, TRUE),
                                b = rnorm(5),
                                c = rnorm(5)),
                d3 = data.frame(a = sample(letters[1:3], 5, TRUE),
                                b = rnorm(5),
                                c = rnorm(5)))
library(tidyverse)
map(my.list, ~ .x %>% filter(!duplicated(a)) )
#> $d1
#>   a         b          c
#> 1 a 1.5952808  0.5757814
#> 2 c 0.3295078 -0.3053884
#> 3 b 0.4874291  0.3898432
#>
#> $d2
#>   a          b         c
#> 1 b  0.2522234 0.3773956
#> 2 a -0.8919211 0.1333364
#>
#> $d3
#>   a          b          c
#> 1 a -0.2357066  1.1519118
#> 2 c -0.4333103 -0.4295131
#> 3 b -0.6494716  1.2383041
Created on 2021-05-13 by the reprex package (v2.0.0)
If you also want to combine the resulting data frames into one, you may use map_dfr instead of map in the code above.
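As a rough sketch of that variant, assuming a small named list of made-up data frames (the data and the column name "source" are illustrative choices, not part of the original answer):

```r
library(purrr)
library(dplyr)

# Toy named list (hypothetical data)
my.list <- list(
  d1 = data.frame(a = c("x", "y", "x"), b = 1:3, c = c(0.1, 0.2, 0.3)),
  d2 = data.frame(a = c("p", "p", "q"), b = 4:6, c = c(0.4, 0.5, 0.6))
)

# Deduplicate each data frame on a, then row-bind the results;
# .id = "source" adds a column holding the list names (d1, d2)
combined <- map_dfr(my.list, ~ filter(.x, !duplicated(a)), .id = "source")
combined
```

The .id column records which list element each row came from, which is usually worth keeping once the frames are stacked. In purrr 1.0.0 and later, map_dfr is marked as superseded in favour of map() followed by list_rbind(), though it still works.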
We can use subset() without any anonymous function. lapply passes the subset = argument through to subset(), which evaluates a inside each data frame:
out <- lapply(my.list, subset, subset = !duplicated(a))
Or use data.table, calling unique() with its by argument:
library(data.table)
out <- lapply(my.list, function(dat) unique(as.data.table(dat), by = 'a'))
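One thing to keep in mind with this approach is that the result is a list of data.table objects rather than plain data frames. A minimal sketch on made-up data (the toy d1 is an assumption for illustration) showing the returned class and a conversion back, if plain data frames are needed downstream:

```r
library(data.table)

# Toy data (hypothetical)
d1 <- data.frame(a = c("x", "y", "x"), b = 1:3, c = c(0.1, 0.2, 0.3))

# unique() on a data.table with by = "a" keeps the first row for each value of a
out <- unique(as.data.table(d1), by = "a")
class(out)   # "data.table" "data.frame"
out$b        # 1 2

# Convert back to a plain data frame if required
out_df <- as.data.frame(out)
class(out_df)  # "data.frame"
```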