Keep unique elements of each vector in a list of vectors
I have a data frame with 1.6 million rows, and one of the columns is a list of character vectors.
Each element of this list column looks as follows:
c("A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61Q", "B05B")
I would like it to be:
c("A61K","A61Q","B05B")
In other words, I just want to keep the unique values. This process should be repeated for each row.
I have tried this:
sapply(strsplit(try, "|", function(x) paste0(unique(x), collapse = ",")))
I have also tried solutions using for loops, but they take a very long time and R stops responding.
Use unique():
> string <- c("A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61Q", "B05B")
> unique(string)
[1] "A61K" "A61Q" "B05B"
You can handle it using unique() within lapply():
# example df with list column
dat <- data.frame(id = 1:2)
dat$x <- list(
c("A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61Q", "B05B"),
c("A62K", "A61K", "A61K", "A58J", "A61K", "A61K", "A61K", "A61K", "A61K", "A61K", "A61Q", "C97B")
)
dat
id x
1 1 A61K, A61K, A61K, A61K, A61K, A61K, A61K, A61K, A61K, A61K, A61Q, B05B
2 2 A62K, A61K, A61K, A58J, A61K, A61K, A61K, A61K, A61K, A61K, A61Q, C97B
# remove duplicates within list column by row
dat$x <- lapply(dat$x, unique)
dat
id x
1 1 A61K, A61Q, B05B
2 2 A62K, A61K, A58J, A61Q, C97B
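If a single comma-separated string per row is wanted rather than a vector (as the paste0() call in the question's attempt suggests), the same idea can probably be expressed with vapply(). This is only a minimal sketch on a shortened version of the example data above; the column name x_str is a hypothetical choice, not something from the question.

```r
# Example data frame with a list column (shortened version of the example above)
dat <- data.frame(id = 1:2)
dat$x <- list(
  c("A61K", "A61K", "A61Q", "B05B"),
  c("A62K", "A61K", "A61K", "A58J")
)

# Collapse the unique values of each row into one string.
# vapply() enforces that every result is a single character value.
# x_str is a hypothetical column name used only for illustration.
dat$x_str <- vapply(
  dat$x,
  function(v) paste(unique(v), collapse = ","),
  character(1)
)

dat$x_str
# [1] "A61K,A61Q,B05B" "A62K,A61K,A58J"
```

Using vapply() instead of sapply() is a design choice here: it fails loudly if any row does not produce exactly one string, which can help on a large column.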
To filter the data frame, use duplicated().
If this is your data:
df
str data
1 A61K 1
2 A61K 23
3 A61K 4
4 A61K 3
5 A61K 1
6 A61K 23
7 A61K 4
8 A61K 3
9 A61K 1
10 A61K 23
11 A61Q 4
12 B05B 3
Apply the filter using the desired column:
df[!duplicated(df$str), ]
str data
1 A61K 1
11 A61Q 4
12 B05B 3
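Note that duplicated() as used above drops duplicate rows of a data frame. For a list column like the one in the question, the same function could instead be applied to each element. A minimal sketch follows, where the helper name keep_first is hypothetical; the result should match what unique() gives.

```r
# Hypothetical helper: keep the first occurrence of each value in a vector
keep_first <- function(v) v[!duplicated(v)]

# Shortened example list of character vectors
x <- list(
  c("A61K", "A61K", "A61Q", "B05B"),
  c("A62K", "A61K", "A61K", "A58J")
)

res <- lapply(x, keep_first)

# Compare against the unique()-based approach
identical(res, lapply(x, unique))
# [1] TRUE
```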