Expanding a data.frame using a vector of values that exist in one of its columns (R)

Question

I have a data frame that looks something like this:

df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))

   entrez entrez_HS
1       1        11
2       2        12
3       3        13
4       4        14
5       5        15
6       6        16
7       7        17
8       8        18
9       9        19
10     10        19
11      1        20

I also have a vector of values that exist in df$entrez_HS:

entrez_HS <- c(11, 11, 12, 19, 19)

For every value in entrez_HS, I want the row(s) of df where df$entrez_HS is equal to that value. Duplicate entries in entrez_HS should result in duplicated rows. Here is the result that I would expect for the above df:

   entrez entrez_HS
1      1        11
2      1        11
3      2        12
4      9        19
5     10        19
6      9        19
7     10        19

I am not sure how to approach this. Thank you.

Answer 1

Merge the vector and the data frame with merge():

merge(mget("entrez_HS"), df, by="entrez_HS")
#or
merge(data.frame(entrez_HS), df, by="entrez_HS")

#  entrez_HS entrez
#1        11      1
#2        11      1
#3        12      2
#4        19      9
#5        19     10
#6        19      9
#7        19     10
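By default merge() sorts its result on the key column, so the rows come back ordered by entrez_HS rather than in the order of the lookup vector. Here the vector happens to be sorted already, so the two coincide. The following is a minimal sketch, not part of the original answer, of one way to keep the vector's order for unsorted input; the helper columns pos and row and the names lookup and df_idx are made up for illustration.

```r
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)

# Tag each lookup value with its position in the vector (hypothetical column 'pos').
lookup <- data.frame(pos = seq_along(entrez_HS), entrez_HS = entrez_HS)

# Tag each row of df with its row number (hypothetical column 'row').
df_idx <- cbind(df, row = seq_len(nrow(df)))

merged <- merge(lookup, df_idx, by = "entrez_HS")

# Restore the order of the lookup vector, then the original row order of df,
# and keep only the columns that df started with.
merged <- merged[order(merged$pos, merged$row), names(df)]
rownames(merged) <- NULL
merged
```

For the example data this gives the seven rows listed in the question, in the same order.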

Answer 2

Without using any package, we can try this:

# Create data
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)

# Extract information, then collect it
result <- lapply(entrez_HS, function(i) df[df$entrez_HS==i,])
result <- do.call("rbind", result)
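The same idea can be written with row indices instead of a list of small data frames. This is a sketch of a variant, not the answerer's code: it collects the matching row numbers for each lookup value and subsets df once. The name idx is illustrative.

```r
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))
entrez_HS <- c(11, 11, 12, 19, 19)

# For each lookup value, find the row numbers of df that match it, in order.
idx <- unlist(lapply(entrez_HS, function(v) which(df$entrez_HS == v)))

# A single subsetting step repeats rows as often as they appear in idx.
result <- df[idx, ]

# Repeated rows get row names such as "1.1"; reset them to 1..n.
rownames(result) <- NULL
result
```

Subsetting once avoids binding many one- or two-row data frames together, which may matter when the lookup vector is long.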

Answer 3

Here is another option:

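rbind(df[match(entrez_HS, df$entrez_HS), ],
      df[duplicated(df$entrez_HS) | duplicated(df$entrez_HS, fromLast = TRUE), ])
#    entrez entrez_HS
#1        1        11
#1.1      1        11
#2        2        12
#9        9        19
#9.1      9        19
#91       9        19
#10      10        19
# Note: match() returns only the first matching row, so entrez 9 appears
# three times and 10 once here, which differs from the expected result
# in the question (9, 10, 9, 10).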

Or using dplyr:

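library(dplyr)
# tibble() replaces the deprecated data_frame().
# dplyr >= 1.1.0 may warn about a many-to-many relationship here.
left_join(tibble(entrez_HS), df, by = "entrez_HS")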
#  entrez_HS entrez
#      <dbl>  <dbl>
#1        11      1
#2        11      1
#3        12      2
#4        19      9
#5        19     10
#6        19      9
#7        19     10
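The join-based answers behave differently if the vector contains a value that is not in df$entrez_HS: merge() performs an inner join by default and drops such a value, while a left join keeps it with NA in the other columns. The question states that all values exist in df, so this only matters for other data. A small base R sketch, with the value 99 and the names lookup, inner and left chosen purely for illustration:

```r
df <- data.frame(entrez = c(1:10, 1), entrez_HS = c(11:19, 19, 20))

# Hypothetical lookup vector: 99 does not occur in df$entrez_HS.
lookup <- data.frame(entrez_HS = c(11, 99))

# Default merge() is an inner join: the unmatched 99 is dropped.
inner <- merge(lookup, df, by = "entrez_HS")

# all.x = TRUE keeps every row of the first argument (left-join behaviour);
# the unmatched 99 is kept with NA in entrez.
left <- merge(lookup, df, by = "entrez_HS", all.x = TRUE)

inner
left
```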

3 answers

Answer 1: 4, accepted, 2016-05-16 02:33:38
Answer 2: 1, 2016-05-16 02:30:50
Answer 3: 1, 2016-05-16 02:35:47
