How to select rows from dataframe1 in R where dataframe1$column is found somewhere in dataframe2$column
I need to create a new dataframe from the rows of dataframe1 whose value of dataframe1$column is also found in dataframe2$column.
The dataframes are:
y <- "name,number,lunch
joe,2,peaches
steve,5,hotdog
jon,7,clamroll
nick,11,sloppyJoe"
x <- "number,office
1,1b
2,1a
3,2s
4,4d
5,f4
6,f4
7,h3
8,g3
9,j7
10,d3
11,jk"
df1 <- read.csv(textConnection(x), header=TRUE, sep=",", stringsAsFactors=FALSE)
df2 <- read.csv(textConnection(y), header=TRUE, sep=",", stringsAsFactors=FALSE)
I have tried:
df3 <- df1[which(df1$number == df2$number), ]
to no avail. How do I properly do this in R? I could write a Perl script, but I have about 100 of these sets and don't want to create more temp files.
Again, the %in% trick:
> df1[df1$number %in% df2$number,]
number office
2 2 1a
5 5 f4
7 7 h3
11 11 jk
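A short sketch of why the == attempt from the question falls short while %in% works. It rebuilds the question's data, assuming (as the output above implies) that df1 holds the offices and df2 the names. The == operator compares the two columns position by position and recycles the shorter one, whereas %in% tests each value of df1$number for membership in df2$number.

```r
# Rebuild the two data frames from the question (df1 = offices, df2 = names).
df1 <- read.csv(text = "number,office
1,1b
2,1a
3,2s
4,4d
5,f4
6,f4
7,h3
8,g3
9,j7
10,d3
11,jk", stringsAsFactors = FALSE)

df2 <- read.csv(text = "name,number,lunch
joe,2,peaches
steve,5,hotdog
jon,7,clamroll
nick,11,sloppyJoe", stringsAsFactors = FALSE)

# Element-wise comparison: df2$number (length 4) is recycled against
# df1$number (length 11), so only values that happen to line up match.
# R also warns that the lengths are not multiples of each other.
eq <- suppressWarnings(df1$number == df2$number)
which(eq)  # only position 7 lines up by coincidence

# Membership test: one TRUE/FALSE per row of df1, independent of position.
keep <- df1$number %in% df2$number
df3 <- df1[keep, ]
df3
```

Negating the test, df1[!(df1$number %in% df2$number), ], gives the rows of df1 that have no match in df2.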
For what it's worth, you can simply use merge if you want to combine them. In this case I'd say that's the cleanest solution: it gives you the office of every employee that occurs in both data frames, and matches them up:
> merge(df1,df2)
number office name lunch
1 2 1a joe peaches
2 5 f4 steve hotdog
3 7 h3 jon clamroll
4 11 jk nick sloppyJoe
Check the help file for merge (?merge) for more options; you can do a whole lot with it.
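As one illustration of those options, here is a minimal sketch that names the key column explicitly instead of relying on merge's default of joining on all shared column names. The data is recreated from the question; the column name emp_number and the objects df2b, by_key and by_xy are hypothetical, introduced only for this example.

```r
# Data from the question (df1 = offices, df2 = names).
df1 <- data.frame(
  number = 1:11,
  office = c("1b", "1a", "2s", "4d", "f4", "f4", "h3", "g3", "j7", "d3", "jk"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  name   = c("joe", "steve", "jon", "nick"),
  number = c(2L, 5L, 7L, 11L),
  lunch  = c("peaches", "hotdog", "clamroll", "sloppyJoe"),
  stringsAsFactors = FALSE
)

# Name the join key explicitly.
by_key <- merge(df1, df2, by = "number")

# If the key columns are named differently, use by.x / by.y.
df2b <- df2
names(df2b)[names(df2b) == "number"] <- "emp_number"  # hypothetical name
by_xy <- merge(df1, df2b, by.x = "number", by.y = "emp_number")

by_key
by_xy
```

Both calls should return the same four matched rows; spelling out the key mainly guards against accidental joins on other columns that happen to share a name.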
Joris' answer is spot on. The merge() command can also be useful for this type of stuff. If you are familiar with SQL joins, you can draw parallels between most of the options in merge() and the different join operations.
#Inner join
> merge(df1,df2)
number office name lunch
1 2 1a joe peaches
2 5 f4 steve hotdog
3 7 h3 jon clamroll
4 11 jk nick sloppyJoe
#Left join:
> merge(df1,df2, all.x = TRUE)
number office name lunch
1 1 1b <NA> <NA>
2 2 1a joe peaches
3 3 2s <NA> <NA>
4 4 4d <NA> <NA>
5 5 f4 steve hotdog
6 6 f4 <NA> <NA>
7 7 h3 jon clamroll
8 8 g3 <NA> <NA>
9 9 j7 <NA> <NA>
10 10 d3 <NA> <NA>
11 11 jk nick sloppyJoe
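The remaining join types follow the same pattern through the all.y and all arguments. In the question's data every number in df2 also occurs in df1, so the sketch below adds one hypothetical employee (amy, number 12) with no office to make the difference visible. That extra row and the names df2x, right and full are assumptions made for illustration only.

```r
# Data from the question (df1 = offices, df2 = names).
df1 <- data.frame(
  number = 1:11,
  office = c("1b", "1a", "2s", "4d", "f4", "f4", "h3", "g3", "j7", "d3", "jk"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  name   = c("joe", "steve", "jon", "nick"),
  number = c(2L, 5L, 7L, 11L),
  lunch  = c("peaches", "hotdog", "clamroll", "sloppyJoe"),
  stringsAsFactors = FALSE
)

# Hypothetical extra employee whose number has no match in df1.
df2x <- rbind(
  df2,
  data.frame(name = "amy", number = 12L, lunch = "salad",
             stringsAsFactors = FALSE)
)

# Right join: keep every row of the second data frame.
right <- merge(df1, df2x, all.y = TRUE)

# Full outer join: keep every row of both data frames.
full <- merge(df1, df2x, all = TRUE)

right
full
```

Here right has five rows, with office set to NA for number 12, and full has twelve rows, with NA wherever one side has no match.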