R: adding matching vector values from two data frames in one column
I have a data frame which is configured roughly like this:
df <- cbind(c('hello', 'yes', 'example'),c(7,8,5),c(0,0,0))
words | frequency | count |
---|---|---|
hello | 7 | 0 |
yes | 8 | 0 |
example | 5 | 0 |
What I'm trying to do is add values to the third column from a different data frame, which is similar but looks like this:
df2 <- cbind(c('example','hello') ,c(5,6))
words | frequency |
---|---|
example | 5 |
hello | 6 |
My goal is to find matching values in the first column of both data frames (they have the same column name) and copy the corresponding values from the second data frame into the third column of the first data frame.
The result should look like this:
df <- cbind(c('hello', 'yes', 'example'),c(7,8,5),c(6,0,5))
words | frequency | count |
---|---|---|
hello | 7 | 6 |
yes | 8 | 0 |
example | 5 | 5 |
What I've tried so far is:
df <- merge(df,df2, by = "words", all.x=TRUE)
However, it doesn't work.
I could use some help understanding how this could be done. Any help will be welcome.
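One possible reason the attempt fails (an assumption, since the question shows no error message) is that `cbind()` on plain vectors builds a character matrix without column names, so there is no `words` column to merge on. A minimal sketch of the difference, using `data.frame()` to build the same sample data:

```r
# cbind() on vectors returns a matrix, not a data frame
m <- cbind(c("hello", "yes", "example"), c(7, 8, 5), c(0, 0, 0))
is.matrix(m)   # the numbers are coerced to character strings
colnames(m)    # NULL: no column is called "words"

# data.frame() keeps the column names and the numeric types
df  <- data.frame(words = c("hello", "yes", "example"),
                  frequency = c(7, 8, 5), count = c(0, 0, 0))
df2 <- data.frame(words = c("example", "hello"), frequency = c(5, 6))

# The merge now runs, but it adds frequency.x / frequency.y columns
# instead of filling count, and it sorts the rows by words
merged <- merge(df, df2, by = "words", all.x = TRUE)
print(merged)
```

Even with proper data frames, then, a plain `merge()` only places the two `frequency` columns side by side; the values still have to be moved into `count` in a separate step.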
This is an "update join". My favorite way to do it is in dplyr:
library(dplyr)
df %>% rows_update(rename(df2, count = frequency), by = "words")
In base R you could do the same thing like this:
names(df2)[2] = "count2"
df = merge(df, df2, by = "words", all.x=TRUE)
df$count = ifelse(is.na(df$count2), df$count, df$count2)
df$count2 = NULL
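Note that `merge()` sorts the result by the key column by default, so the rows come back in a different order than in the original `df`. Below is a minimal sketch of the same base R approach on the sample data, restoring the original order afterwards; the names `orig_order` and `res` are only illustrative.

```r
df  <- data.frame(words = c("hello", "yes", "example"),
                  frequency = c(7, 8, 5), count = c(0, 0, 0))
df2 <- data.frame(words = c("example", "hello"), frequency = c(5, 6))

orig_order <- df$words                     # remember the original row order
names(df2)[2] <- "count2"                  # avoid frequency.x / frequency.y
res <- merge(df, df2, by = "words", all.x = TRUE)  # rows are sorted by words
res$count <- ifelse(is.na(res$count2), res$count, res$count2)
res$count2 <- NULL
res <- res[match(orig_order, res$words), ] # back to the original order
rownames(res) <- NULL
print(res)
```

If the row order does not matter, the last two reordering lines can be dropped.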
Here is an option with data.table:
library(data.table)
setDT(df)[setDT(df2), on = "words", count := i.frequency]
Output
words frequency count
<char> <num> <num>
1: hello 7 6
2: yes 8 0
3: example 5 5
Or using match in base R:
df$count[match(df2$words, df$words)] <- df2$frequency
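The one-liner above assumes that every word in `df2` also occurs in `df`. If that is not guaranteed, `match()` returns `NA` for the missing words and the subscripted assignment would be expected to fail. A more defensive sketch matches in the other direction and updates only the rows that have a partner; the extra word `"other"` and the names `idx` and `hit` are made up for illustration.

```r
df  <- data.frame(words = c("hello", "yes", "example"),
                  frequency = c(7, 8, 5), count = c(0, 0, 0))
# "other" is a hypothetical extra word that does not occur in df
df2 <- data.frame(words = c("example", "hello", "other"),
                  frequency = c(5, 6, 9))

idx <- match(df$words, df2$words)  # position of each df word in df2, NA if absent
hit <- !is.na(idx)                 # rows of df that have a match in df2
df$count[hit] <- df2$frequency[idx[hit]]
print(df)
```

Rows of `df` without a match keep their existing `count`, and words that appear only in `df2` are ignored.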
Or another tidyverse option, using left_join and pmax:
library(tidyverse)
left_join(df, df2 %>% rename(count.y = frequency), by = "words") %>%
  mutate(count = pmax(count.y, count, na.rm = TRUE)) %>%
  select(-count.y)
Data
df <- structure(list(words = c("hello", "yes", "example"), frequency = c(7,
8, 5), count = c(0, 0, 0)), class = "data.frame", row.names = c(NA,
-3L))
df2 <- structure(list(words = c("example", "hello"), frequency = c(5, 6)), class = "data.frame", row.names = c(NA,
-2L))