Expand data.frame using vector of values present in one column in R
I have a data frame that looks something like this:
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez entrez_HS
1 1 11
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16
7 7 17
8 8 18
9 9 19
10 10 19
11 1 20
I also have a vector of values that exist in df$entrez_HS:
entrez_HS <- c(11, 11, 12, 19, 19)
For every value in entrez_HS, I want the row(s) of df where df$entrez_HS is equal to that value. Duplicate entries in entrez_HS should result in duplicated rows. Here is the result that I would expect for the above df:
entrez entrez_HS
1 1 11
2 1 11
3 2 12
4 9 19
5 10 19
6 9 19
7 10 19
I'm not sure how to approach this. Thank you.
merge the data together:
merge(mget("entrez_HS"), df, by="entrez_HS")
#or
merge(data.frame(entrez_HS), df, by="entrez_HS")
# entrez_HS entrez
#1 11 1
#2 11 1
#3 12 2
#4 19 9
#5 19 10
#6 19 9
#7 19 10
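As a small follow-up sketch (the name `merged` is only illustrative and not part of the answer above), the merged result can be put back into the column layout used in the question. Note that merge() sorts on the key by default, so the rows follow the key values rather than the order of the lookup vector; for this example the two happen to coincide.

```r
# Data from the question
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)

# Join the lookup vector (wrapped in a one-column data frame) against df
merged <- merge(data.frame(entrez_HS), df, by = "entrez_HS")

# Reorder the columns to match the layout shown in the question
merged <- merged[, c("entrez", "entrez_HS")]
merged
```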
Without using any package, we can try this:
# Create data
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)
# Extract information, then collect it
result <- lapply(entrez_HS, function(i) df[df$entrez_HS==i,])
result <- do.call("rbind", result)
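A possible variation on the same idea, sketched here under the assumption that only the row positions are needed: collect the matching row indices first, then subset df once instead of binding many small data frames. The names `idx` and `result` are illustrative.

```r
# Data from the question
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)

# For each lookup value, find the positions of the matching rows in df
idx <- unlist(lapply(entrez_HS, function(v) which(df$entrez_HS == v)))

# Subset once; repeated indices produce repeated rows
result <- df[idx, ]

# Reset the row names so they run 1..n as in the expected output
rownames(result) <- NULL
result
```

Because the indices are gathered in the order of entrez_HS, the rows come out in the order of the lookup vector, with duplicates preserved.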
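Here is another option. Note that on this data it returns row 9 three times and row 10 only once, so it does not exactly reproduce the expected output: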
rbind(df[match(entrez_HS, df$entrez_HS),],
df[duplicated(df$entrez_HS)|duplicated(df$entrez_HS,
fromLast=TRUE),])
# entrez entrez_HS
#1 1 11
#1.1 1 11
#2 2 12
#9 9 19
#9.1 9 19
#91 9 19
#10 10 19
Or using dplyr:
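library(dplyr)
# tibble() replaces the deprecated data_frame(); naming the key makes the join column explicit
left_join(tibble(entrez_HS), df, by = "entrez_HS")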
# entrez_HS entrez
# <dbl> <dbl>
#1 11 1
#2 11 1
#3 12 2
#4 19 9
#5 19 10
#6 19 9
#7 19 10