Assign values to a df$column from another df?

Question

Example: I have a data frame whose first column is:

dat <- c("A","B","C","A")

and then I have another data frame whose first two columns are:

dat2[, 1]
[1] A B C
Levels: A B C

dat2[, 2]
[1] 21000 23400 26800

How can I add the values from the second data frame (dat2) to the first one (dat)? The first data frame contains repetitions, and every time there is an "A" I want the corresponding value from the second data frame (21000) to appear in a new column.

Answer 1

First, generate reproducible data frames:

dat1 <- data.frame(x1 = c("A","B","C","A"), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = c("A","B","C"),
                   x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

Then use the match function:

dat1$dat2_vals <- dat2$x2[match(dat1$x1, dat2$x1)]

It is safest to store the key columns as character rather than factor. match() itself converts factors to character before comparing, but factors with different level sets can cause surprises elsewhere (for example, indexing by a factor uses its integer codes, not its labels). I mention this because of the Levels attribute in your dat2.
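To make the behaviour of match() concrete, here is a small sketch on the same toy data. The extra key "D" (which has no counterpart in dat2) and the column name val are assumptions added for illustration only.

```r
# Lookup table: one row per key
dat2 <- data.frame(x1 = c("A", "B", "C"),
                   x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

# Target data with a repeated key and one key ("D") that is absent from dat2
dat1 <- data.frame(x1 = c("A", "B", "C", "A", "D"), stringsAsFactors = FALSE)

# For each element of dat1$x1, match() returns the position of its
# first match in dat2$x1, or NA when there is none
idx <- match(dat1$x1, dat2$x1)
idx
# [1]  1  2  3  1 NA

# Indexing dat2$x2 with those positions pulls the values; NA positions give NA
dat1$val <- dat2$x2[idx]
dat1$val
# [1] 21000 23400 26800 21000    NA
```

Because match() only reports the first matching position, this approach assumes the lookup table has one row per key.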

Answer 2

Another option, which I prefer, is left_join from dplyr. It seems to be faster than merge on large data frames.

library(dplyr)  # library() stops with an error if dplyr is missing; require() only warns

dat1 <- data.frame(x1 = c("A","B","C","A"), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = c("A","B","C"),
                   x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

dat1 <- left_join(dat1, dat2, by="x1")
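A short sketch of what left_join returns, assuming dplyr is installed. The extra key "D" is not in the original data and is added here only to show what happens to unmatched rows.

```r
library(dplyr)

dat1 <- data.frame(x1 = c("A", "B", "C", "A", "D"), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = c("A", "B", "C"),
                   x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

# left_join keeps every row of the left table, in its original order;
# keys with no match in dat2 get NA in the joined columns
res <- left_join(dat1, dat2, by = "x1")
res$x2
# [1] 21000 23400 26800 21000    NA
```

Unlike the match approach, left_join brings over every non-key column of dat2 in one call, which is convenient when the lookup table has several value columns.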

Answer 3

Let's race large data frames with microbenchmark, just for fun!

Create large data frames:

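# Note: runif(1, 0) draws a single random number that is recycled down the column,
# and dat2 repeats each key 1000 times, so the joins below are many-to-many.
dat1 <- data.frame(x1 = rep(c("A","B","C","A"), 1000), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = rep(c("A","B","C", "D"), 1000),
                   x2 = runif(1,0), stringsAsFactors = FALSE)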

On your marks, get set, go!

library(dplyr)           # provides left_join
library(microbenchmark)
mbm <- microbenchmark(
  left_join = left_join(dat1, dat2, by="x1"),
  merge = merge(dat1, dat2, by = "x1"),
  times = 20
)
print(mbm)               # show the timing summary

Many, many seconds later... left_join is MUCH faster on large data frames.
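One thing worth checking before reading too much into such timings: in the benchmark data, dat2 repeats every key, so each join is many-to-many and the result is far larger than dat1. The following scaled-down sketch uses base R only, with 10 repetitions instead of 1000 (an assumption made to keep it quick), to show the effect on row counts.

```r
# Same construction as the benchmark data, but with 10 repetitions instead of 1000
n <- 10
dat1 <- data.frame(x1 = rep(c("A", "B", "C", "A"), n), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = rep(c("A", "B", "C", "D"), n),
                   x2 = runif(1), stringsAsFactors = FALSE)

# Each dat1 row pairs with every dat2 row that shares its key
many <- merge(dat1, dat2, by = "x1")
nrow(dat1)   # 40
nrow(many)   # 400: 20 "A" x 10 + 10 "B" x 10 + 10 "C" x 10

# With one row per key in the lookup table, the row count is preserved
lookup <- dat2[!duplicated(dat2$x1), ]
one <- merge(dat1, lookup, by = "x1")
nrow(one)    # 40
```

At the original size the same arithmetic turns 4,000 input rows into 4,000,000 output rows, so much of the measured time goes into building a very large result rather than into the lookup itself.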

Answer 4

Use the merge function.

# Input data
dat  <- data.frame(ID = c("A", "B", "C", "A"))
dat2 <- data.frame(ID = c("A", "B", "C"), 
                   value = c(1, 2, 3))
# Merge two data.frames by specified column
merge(dat, dat2, by = "ID")
  ID value
1  A     1
2  A     1
3  B     2
4  C     3
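Note that the rows in this output come back sorted by ID rather than in the original order of dat, and that by default merge drops rows without a match. Here is a minimal sketch of one way to keep unmatched rows and restore the input order; the helper column row_id and the extra key "D" are assumptions added for illustration.

```r
dat  <- data.frame(ID = c("A", "B", "C", "A", "D"), stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("A", "B", "C"),
                   value = c(1, 2, 3), stringsAsFactors = FALSE)

# Remember the original row order before merging
dat$row_id <- seq_len(nrow(dat))

# all.x = TRUE keeps rows of dat that have no match in dat2 (value becomes NA)
res <- merge(dat, dat2, by = "ID", all.x = TRUE)

# merge() sorts by the key, so reorder by the saved position and drop the helper
res <- res[order(res$row_id), c("ID", "value")]
res$value
# [1]  1  2  3  1 NA
```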

4 answers

Answer 1
5 accepted 2017-09-05 22:37:02

Answer 2
2 2017-09-06 00:08:08

Answer 3
2 2017-09-06 03:02:15

Answer 4
1 2017-09-05 22:38:49
